The Token Company
A prompt compression API that removes context bloat from LLM inputs, reducing token costs and improving accuracy with a simple drop-in middleware integration.
Listed Mar 2026
About The Token Company
The Token Company provides a prompt compression API that removes semantic redundancy and context bloat from LLM inputs before they reach your model. Using their bear-1.x model family, developers can reduce token counts by up to 75%, cutting LLM costs dramatically while simultaneously improving accuracy and reducing latency. The API integrates in minutes as drop-in middleware with a single POST call, and benchmarks show measurable improvements on real-world financial documents and reading comprehension tasks.
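The drop-in middleware pattern described above can be sketched as a simple wrapper: compress the prompt first, then hand the result to whatever LLM call you already have. The `compress` argument below stands in for the actual API call (shown here with offline stand-ins, since the exact client code depends on your stack):

```python
from typing import Callable

def with_compression(llm_call: Callable[[str], str],
                     compress: Callable[[str], str]) -> Callable[[str], str]:
    """Wrap an existing LLM call so every prompt is compressed first.

    `compress` should be a function that sends text to the
    /v1/compress endpoint and returns the compressed text.
    """
    def wrapped(prompt: str) -> str:
        return llm_call(compress(prompt))
    return wrapped

# Demo with stand-ins (no network): a fake compressor that strips filler words.
fake_compress = lambda s: " ".join(w for w in s.split() if w not in {"very", "really"})
fake_llm = lambda p: f"answered: {p}"

ask = with_compression(fake_llm, fake_compress)
print(ask("a very really short prompt"))  # -> answered: a short prompt
```

Swapping `fake_compress` for a real call to the compression API is the only change needed to put this in front of production traffic.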
- bear-1.x Compression Models: Use `bear-1`, `bear-1.1`, or `bear-1.2` (recommended) to semantically compress prompts while preserving intent and logical relationships.
- Usage-Based Pricing: Pay only $0.05 per 1M tokens removed; you are never charged for tokens that remain in the output.
- One-Call Integration: Send text to `POST api.thetokencompany.com/v1/compress` with your API key and receive compressed text back; drop it in before any LLM call.
- Adjustable Aggressiveness: Control compression intensity with an `aggressiveness` parameter from 0.0 to 1.0 to balance compression ratio vs. fidelity.
- Protected Tokens: Wrap sensitive or critical text in `<ttc_safe>` tags to prevent those sections from being compressed.
- Gzip Support: Enable gzip encoding on requests for up to 2.5x faster large-payload transfers; enabled by default in the Python SDK and npm package.
- Python SDK & npm Package: Get started quickly with official SDKs that handle authentication, gzip, and response parsing out of the box.
- Proven Benchmarks: Compression improved SEC filing QA accuracy by 2.7 pp with 20% fewer tokens, SQuAD 2.0 accuracy by 4.0 pp with 17% fewer tokens, and cut end-to-end latency by up to 37% on Claude Opus.
- Chat & Document Use Cases: Expand conversation history 3x within the same context window, or process large PDFs and web scrapes without bloated inputs.
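A single call to the compression endpoint might look like the sketch below. The endpoint URL and the `aggressiveness` parameter come from the feature list above; the JSON field names (`model`, `input`) and the bearer-token auth scheme are assumptions, so check the official API reference for the exact schema.

```python
import json

API_URL = "https://api.thetokencompany.com/v1/compress"

def build_compress_request(text: str, api_key: str,
                           aggressiveness: float = 0.5) -> tuple[dict, bytes]:
    """Build headers and a JSON body for POST /v1/compress.

    Field names ("model", "input") and bearer auth are assumptions;
    the endpoint and the 0.0-1.0 `aggressiveness` range are from the listing.
    """
    if not 0.0 <= aggressiveness <= 1.0:
        raise ValueError("aggressiveness must be in [0.0, 1.0]")
    headers = {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    }
    body = json.dumps({
        "model": "bear-1.2",               # recommended model
        "input": text,                     # wrap critical spans in <ttc_safe> tags
        "aggressiveness": aggressiveness,  # higher = more compression, less fidelity
    }).encode("utf-8")
    return headers, body

# Sending is then one POST, e.g. with the requests library:
# requests.post(API_URL, headers=headers, data=body)
```

Any text inside `<ttc_safe>` tags in the `input` field is left untouched by the compressor.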
Pricing
Free Plan Available
Free to try with no credit card required. Pay only for tokens compressed.
- Access to bear-1, bear-1.1, bear-1.2 models
- No credit card required to start
- $0.05 per 1M compressed tokens after free usage
Usage-Based
Pay-as-you-go at $0.05 per 1M tokens removed (compressed tokens). No flat monthly fee.
- Access to all bear-1.x models
- $0.05 per 1M compressed (removed) tokens
- Only pay for tokens actually removed
- Python SDK and npm package
- Gzip compression support
- Protected tokens via <ttc_safe> tags
- Adjustable aggressiveness parameter
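Because billing is per token removed, the cost of a call follows directly from the input and output token counts the API reports. A back-of-the-envelope sketch using the $0.05 per 1M removed tokens rate from the plan above:

```python
PRICE_PER_MILLION_REMOVED = 0.05  # USD, from the pricing above

def compression_cost(input_tokens: int, output_tokens: int) -> float:
    """Cost in USD: you pay only for tokens removed, never for tokens kept."""
    removed = input_tokens - output_tokens
    return removed / 1_000_000 * PRICE_PER_MILLION_REMOVED

# e.g. compressing 100M tokens of prompts at a 75% reduction
# removes 75M tokens, so the compression itself costs:
print(compression_cost(100_000_000, 25_000_000))  # -> 3.75 (USD)
```

At that rate the compression fee is typically dwarfed by the LLM spend it offsets.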
Capabilities
Key Features
- Prompt compression via bear-1, bear-1.1, bear-1.2 models
- Usage-based pricing at $0.05 per 1M compressed tokens
- Single POST API endpoint for drop-in middleware integration
- Adjustable compression aggressiveness (0.0–1.0)
- Protected tokens via <ttc_safe> tags
- Gzip compression support for faster large payloads
- Python SDK and npm package
- Token count reporting (input vs. output)
- Real-world benchmarks on financial and reading comprehension tasks
- Infinite chat history demo
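For the gzip support listed above, the idea is to compress the JSON body client-side and flag it with a `Content-Encoding: gzip` header (the official SDKs do this by default). A stdlib-only sketch, with the header usage as the conventional assumption:

```python
import gzip
import json

# A large, repetitive payload of the kind gzip handles well.
payload = json.dumps({"input": "a long scraped document " * 1000}).encode("utf-8")
compressed = gzip.compress(payload)

headers = {
    "Content-Type": "application/json",
    "Content-Encoding": "gzip",  # tells the server the body is gzipped
}
# requests.post("https://api.thetokencompany.com/v1/compress",
#               data=compressed, headers=headers)

print(f"{len(payload)} bytes -> {len(compressed)} bytes on the wire")
```

The wire savings (and thus the transfer speedup) scale with how repetitive the document is; scraped HTML and boilerplate-heavy PDFs compress especially well.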
