# Compresr

> Compresr compresses LLM context to reduce token costs, improve accuracy, and cut latency in AI pipelines using query-aware and query-agnostic compression models.

Compresr is a context compression API for LLM pipelines that reduces token usage by up to 200x without quality loss. It offers multiple compression models, from query-agnostic pre-compression to aggressive query-specific filtering, enabling developers to cut costs, reduce latency, and improve accuracy in RAG, search, Q&A, and agent workflows. Backed by Y Combinator (W26), Compresr also provides an open-source Context Gateway for agent frameworks like Claude Code and OpenClaw.

- **Espresso V1**: *Query-agnostic token-level compression; pre-compress long documents, system prompts, or agent histories once and reuse them across multiple queries.*
- **Latte V1**: *Query-specific token-level compression that retains only the tokens relevant to a given question, enabling up to 200x compression; ideal for RAG, search, and Q&A pipelines.*
- **Coldbrew V1**: *Query-specific chunk-level filtering that drops entire irrelevant chunks before they reach the model; ideal for structured data like transcripts or logs.*
- **Context Gateway**: *Open-source proxy for agents that compresses conversation history, tool outputs, and tool lists; compatible with Claude Code, OpenClaw, Codex, and more.*
- **Coarse-grained filtering**: *Pass a query and a list of chunks to retrieve only the relevant ones, reducing context before fine-grained compression.*
- **Fine-grained compression**: *Token-level compression given a query and context, stripping irrelevant tokens for maximum efficiency.*
- **Usage-based pricing**: *Pay per million tokens processed, with no upfront commitment; get started by signing up and calling the API.*
- **Demo environment**: *Try compression live on sample datasets like SEC filings directly from the dashboard, without writing any code.*

## Features

- Token-level context compression
- Chunk-level context filtering
- Query-agnostic pre-compression
- Query-specific compression
- Up to 200x compression ratio
- Context Gateway for agents
- Conversation history compression
- Tool output compression
- RAG pipeline optimization
- Open-source proxy gateway
- REST API access
- Dashboard demo environment

## Integrations

Claude Code, OpenClaw, Codex, GPT models, RAG pipelines

## Platforms

macOS, Web, API

## Pricing

Open Source, Paid

## Links

- Website: https://compresr.ai
- Documentation: https://compresr.ai/docs
- Repository: https://github.com/Compresr-ai/Context-Gateway
- EveryDev.ai: https://www.everydev.ai/tools/compresr
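To make the fine-grained (query-specific) compression flow concrete, here is a minimal sketch of assembling such a request. The field names, model identifier, and the commented endpoint are assumptions for illustration only, not the documented API; see https://compresr.ai/docs for the real request shape.

```python
import json

def build_compress_request(query: str, context: str, model: str = "latte-v1") -> dict:
    """Assemble a JSON payload for a query-specific compression call.

    NOTE: the keys and the "latte-v1" model id are hypothetical placeholders,
    chosen to mirror the Latte V1 model described above.
    """
    return {
        "model": model,      # which compression model to apply
        "query": query,      # the question driving token relevance
        "context": context,  # the long document to compress
    }

payload = build_compress_request(
    query="What was the company's Q3 revenue?",
    context="(long SEC filing text here)",
)
print(json.dumps(payload, indent=2))

# The payload could then be sent with any HTTP client, e.g.:
#   requests.post(COMPRESR_ENDPOINT, json=payload,
#                 headers={"Authorization": f"Bearer {api_key}"})
# where COMPRESR_ENDPOINT and api_key come from your account setup.
```

The same payload shape would extend to coarse-grained filtering by passing a list of chunks instead of a single context string, as the Coldbrew V1 description suggests.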