The Token Company
A prompt compression API that removes context bloat from LLM inputs, reducing token costs and improving accuracy with a simple drop-in middleware integration.
Listed Mar 2026
About The Token Company
The Token Company provides a prompt compression API that removes semantic redundancy and context bloat from LLM inputs before they reach your model. Using their bear-1.x model family, developers can reduce token counts by up to 75%, cutting LLM costs dramatically while simultaneously improving accuracy and reducing latency. The API integrates in minutes as drop-in middleware with a single POST call, and benchmarks show measurable improvements on real-world financial documents and reading comprehension tasks.
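The drop-in middleware pattern described above can be sketched as a simple wrapper: compress the prompt first, then hand the result to whatever LLM call you already have. The `compress` argument below stands in for the actual API call (shown here with offline stand-ins, since the exact client code depends on your stack):

```python
from typing import Callable

def with_compression(llm_call: Callable[[str], str],
                     compress: Callable[[str], str]) -> Callable[[str], str]:
    """Wrap an existing LLM call so every prompt is compressed first.

    `compress` should be a function that sends text to the
    /v1/compress endpoint and returns the compressed text.
    """
    def wrapped(prompt: str) -> str:
        return llm_call(compress(prompt))
    return wrapped

# Demo with stand-ins (no network): a fake compressor that strips filler words.
fake_compress = lambda s: " ".join(w for w in s.split() if w not in {"very", "really"})
fake_llm = lambda p: f"answered: {p}"

ask = with_compression(fake_llm, fake_compress)
print(ask("a very really short prompt"))  # -> answered: a short prompt
```

Swapping `fake_compress` for a real call to the compression API is the only change needed to put this in front of production traffic.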
- bear-1.x Compression Models: Use `bear-1`, `bear-1.1`, or `bear-1.2` (recommended) to semantically compress prompts while preserving intent and logical relationships.
- Usage-Based Pricing: Pay only $0.05 per 1M tokens removed; you are never charged for tokens that remain in the output.
- One-Call Integration: Send text to `POST api.thetokencompany.com/v1/compress` with your API key and receive compressed text back; drop it in before any LLM call.
- Adjustable Aggressiveness: Control compression intensity with an `aggressiveness` parameter from 0.0 to 1.0 to balance compression ratio vs. fidelity.
- Protected Tokens: Wrap sensitive or critical text in `<ttc_safe>` tags to prevent those sections from being compressed.
- Gzip Support: Enable gzip encoding on requests for up to 2.5x faster large-payload transfers; enabled by default in the Python SDK and npm package.
- Python SDK & npm Package: Get started quickly with official SDKs that handle authentication, gzip, and response parsing out of the box.
- Proven Benchmarks: Compression improved SEC filing QA accuracy by 2.7 pp with 20% fewer tokens, SQuAD 2.0 accuracy by 4.0 pp with 17% fewer tokens, and cut end-to-end latency by up to 37% on Claude Opus.
- Chat & Document Use Cases: Expand conversation history 3x within the same context window, or process large PDFs and web scrapes without bloated inputs.
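A single call to the compression endpoint might look like the sketch below. The endpoint URL and the `aggressiveness` parameter come from the feature list above; the JSON field names (`model`, `input`) and the bearer-token auth scheme are assumptions, so check the official API reference for the exact schema.

```python
import json

API_URL = "https://api.thetokencompany.com/v1/compress"

def build_compress_request(text: str, api_key: str,
                           aggressiveness: float = 0.5) -> tuple[dict, bytes]:
    """Build headers and a JSON body for POST /v1/compress.

    Field names ("model", "input") and bearer auth are assumptions;
    the endpoint and the 0.0-1.0 `aggressiveness` range are from the listing.
    """
    if not 0.0 <= aggressiveness <= 1.0:
        raise ValueError("aggressiveness must be in [0.0, 1.0]")
    headers = {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    }
    body = json.dumps({
        "model": "bear-1.2",               # recommended model
        "input": text,                     # wrap critical spans in <ttc_safe> tags
        "aggressiveness": aggressiveness,  # higher = more compression, less fidelity
    }).encode("utf-8")
    return headers, body

# Sending is then one POST, e.g. with the requests library:
# requests.post(API_URL, headers=headers, data=body)
```

Any text inside `<ttc_safe>` tags in the `input` field is left untouched by the compressor.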
Pricing
Free Plan Available
Free to try with no credit card required. Pay only for tokens compressed.
- Access to bear-1, bear-1.1, bear-1.2 models
- No credit card required to start
- $0.05 per 1M compressed tokens after free usage
Usage-Based
Pay-as-you-go at $0.05 per 1M tokens removed (compressed tokens). No flat monthly fee.
- Access to all bear-1.x models
- $0.05 per 1M compressed (removed) tokens
- Only pay for tokens actually removed
- Python SDK and npm package
- Gzip compression support
- Protected tokens via <ttc_safe> tags
- Adjustable aggressiveness parameter
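Because billing is per token removed, the cost of a call follows directly from the input and output token counts the API reports. A back-of-the-envelope sketch using the $0.05 per 1M removed tokens rate from the plan above:

```python
PRICE_PER_MILLION_REMOVED = 0.05  # USD, from the pricing above

def compression_cost(input_tokens: int, output_tokens: int) -> float:
    """Cost in USD: you pay only for tokens removed, never for tokens kept."""
    removed = input_tokens - output_tokens
    return removed / 1_000_000 * PRICE_PER_MILLION_REMOVED

# e.g. compressing 100M tokens of prompts at a 75% reduction
# removes 75M tokens, so the compression itself costs:
print(compression_cost(100_000_000, 25_000_000))  # -> 3.75 (USD)
```

At that rate the compression fee is typically dwarfed by the LLM spend it offsets.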
Capabilities
Key Features
- Prompt compression via bear-1, bear-1.1, bear-1.2 models
- Usage-based pricing at $0.05 per 1M compressed tokens
- Single POST API endpoint for drop-in middleware integration
- Adjustable compression aggressiveness (0.0–1.0)
- Protected tokens via <ttc_safe> tags
- Gzip compression support for faster large payloads
- Python SDK and npm package
- Token count reporting (input vs. output)
- Real-world benchmarks on financial and reading comprehension tasks
- Infinite chat history demo
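For the gzip support listed above, the idea is to compress the JSON body client-side and flag it with a `Content-Encoding: gzip` header (the official SDKs do this by default). A stdlib-only sketch, with the header usage as the conventional assumption:

```python
import gzip
import json

# A large, repetitive payload of the kind gzip handles well.
payload = json.dumps({"input": "a long scraped document " * 1000}).encode("utf-8")
compressed = gzip.compress(payload)

headers = {
    "Content-Type": "application/json",
    "Content-Encoding": "gzip",  # tells the server the body is gzipped
}
# requests.post("https://api.thetokencompany.com/v1/compress",
#               data=compressed, headers=headers)

print(f"{len(payload)} bytes -> {len(compressed)} bytes on the wire")
```

The wire savings (and thus the transfer speedup) scale with how repetitive the document is; scraped HTML and boilerplate-heavy PDFs compress especially well.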
