# The Token Company

> A prompt compression API that removes context bloat from LLM inputs, reducing token costs and improving accuracy with a simple drop-in middleware integration.

The Token Company provides a prompt compression API that removes semantic redundancy and context bloat from LLM inputs before they reach your model. Using its bear-1.x model family, developers can reduce token counts by up to 75%, cutting LLM costs while improving accuracy and reducing latency. The API integrates in minutes as drop-in middleware with a single POST call, and benchmarks show measurable improvements on real-world financial documents and reading comprehension tasks.

- **bear-1.x Compression Models**: *Use `bear-1`, `bear-1.1`, or `bear-1.2` (recommended) to semantically compress prompts while preserving intent and logical relationships.*
- **Usage-Based Pricing**: *Pay only $0.05 per 1M tokens removed; you are never charged for tokens that remain in the output.*
- **One-Call Integration**: *Send text to `POST api.thetokencompany.com/v1/compress` with your API key and receive compressed text back; drop it in before any LLM call.*
- **Adjustable Aggressiveness**: *Control compression intensity with an `aggressiveness` parameter from 0.0 to 1.0 to balance compression ratio vs. fidelity.*
- **Protected Tokens**: *Wrap sensitive or critical text in `` tags to prevent those sections from being compressed.*
- **Gzip Support**: *Enable gzip encoding on requests for up to 2.5x faster large-payload transfers; enabled by default in the Python SDK and npm package.*
- **Python SDK & npm Package**: *Get started quickly with official SDKs that handle authentication, gzip, and response parsing out of the box.*
- **Proven Benchmarks**: *Compression improved SEC filing QA accuracy by 2.7pp with 20% fewer tokens, SQuAD 2.0 accuracy by 4.0pp with 17% fewer tokens, and reduced end-to-end latency by up to 37% on Claude Opus.*
- **Chat & Document Use Cases**: *Expand conversation history 3x within the same context window, or process large PDFs and web scrapes without bloated inputs.*

## Features

- Prompt compression via bear-1, bear-1.1, and bear-1.2 models
- Usage-based pricing at $0.05 per 1M compressed tokens
- Single POST API endpoint for drop-in middleware integration
- Adjustable compression aggressiveness (0.0–1.0)
- Protected tokens via tags
- Gzip compression support for faster large payloads
- Python SDK and npm package
- Token count reporting (input vs. output)
- Real-world benchmarks on financial and reading comprehension tasks
- Infinite chat history demo

## Integrations

OpenAI GPT, Anthropic Claude, Google Gemini, OpenRouter, Any LLM API

## Platforms

WEB, API, DEVELOPER_SDK

## Pricing

Open Source, Free tier available

## Version

bear-1.2

## Links

- Website: https://thetokencompany.com
- Documentation: https://thetokencompany.com/docs
- Repository: https://github.com/TheTokenCompany
- EveryDev.ai: https://www.everydev.ai/tools/the-token-company
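## Example: one-call integration

The listing describes a single `POST` call to `api.thetokencompany.com/v1/compress` with an API key, a model choice, and an `aggressiveness` value. A minimal Python sketch of building that request follows; the JSON field names (`text`, `model`, `aggressiveness`), the bearer-token auth scheme, and the `compressed_text` response field are assumptions not confirmed by this listing, so check the official docs for the exact schema.

```python
import json
import urllib.request

# Endpoint from the listing; request/response field names below are assumed.
API_URL = "https://api.thetokencompany.com/v1/compress"

def build_compress_request(text: str, api_key: str,
                           model: str = "bear-1.2",
                           aggressiveness: float = 0.5) -> urllib.request.Request:
    """Build the HTTP request for a single compression call.

    Field names ("text", "model", "aggressiveness") and the Bearer auth
    header are illustrative assumptions, not a documented schema.
    """
    if not 0.0 <= aggressiveness <= 1.0:
        raise ValueError("aggressiveness must be between 0.0 and 1.0")
    body = {"text": text, "model": model, "aggressiveness": aggressiveness}
    headers = {
        "Authorization": f"Bearer {api_key}",  # assumed auth scheme
        "Content-Type": "application/json",
    }
    return urllib.request.Request(
        API_URL,
        data=json.dumps(body).encode("utf-8"),
        headers=headers,
        method="POST",
    )

# Usage sketch (not executed here):
# req = build_compress_request(long_prompt, "YOUR_API_KEY", aggressiveness=0.7)
# with urllib.request.urlopen(req) as resp:
#     compressed = json.load(resp)["compressed_text"]  # assumed response field
```

Because compression happens before any LLM call, the compressed text can be dropped into an existing OpenAI, Claude, or Gemini request unchanged.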
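## Example: gzip-encoded payloads

For large payloads the listing recommends gzip encoding (enabled by default in the official SDKs). A sketch of doing it by hand, assuming the API accepts the standard `Content-Encoding: gzip` request header for a compressed JSON body:

```python
import gzip
import json

def gzip_payload(body: dict) -> tuple[bytes, dict]:
    """Gzip-encode a JSON request body for faster large-payload uploads.

    "Content-Encoding: gzip" is the standard HTTP way to signal a
    compressed request body; whether this API accepts it is an assumption.
    """
    raw = json.dumps(body).encode("utf-8")
    compressed = gzip.compress(raw)
    headers = {
        "Content-Encoding": "gzip",
        "Content-Type": "application/json",
    }
    return compressed, headers
```

The savings only matter for large, repetitive payloads (big PDFs, web scrapes, long chat histories); for short prompts the gzip header overhead can exceed the reduction.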
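## Example: estimating cost

Pricing is $0.05 per 1M tokens removed, with no charge for tokens that remain. The arithmetic is simple enough to sketch; the helper function name is illustrative, not part of any SDK:

```python
PRICE_PER_MILLION_REMOVED = 0.05  # USD, per the listing

def compression_cost(input_tokens: int, output_tokens: int) -> float:
    """Cost in USD, charged only on tokens removed by compression."""
    removed = input_tokens - output_tokens
    if removed < 0:
        raise ValueError("output tokens cannot exceed input tokens")
    return removed / 1_000_000 * PRICE_PER_MILLION_REMOVED

# Compressing 2,000,000 tokens down to 500,000 removes 1,500,000 tokens:
# compression_cost(2_000_000, 500_000)  # -> 0.075 (USD)
```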