# Compresr

> Compresr compresses LLM context to reduce token costs, improve accuracy, and cut latency in AI pipelines using query-aware and query-agnostic compression models.

Compresr is a context compression API for LLM pipelines that reduces token usage by up to 200x without quality loss. It offers multiple compression models, from query-agnostic pre-compression to aggressive query-specific filtering, enabling developers to cut costs, reduce latency, and improve accuracy in RAG, search, Q&A, and agent workflows. Backed by Y Combinator (W26), Compresr also provides an open-source Context Gateway for agent frameworks like Claude Code and OpenClaw.

- **Espresso V1**: *Query-agnostic token-level compression; pre-compress long documents, system prompts, or agent histories once and reuse them across multiple queries.*
- **Latte V1**: *Query-specific token-level compression that retains only the tokens relevant to a given question, enabling up to 200x compression; ideal for RAG, search, and Q&A pipelines.*
- **Coldbrew V1**: *Query-specific chunk-level filtering that drops entire irrelevant chunks before they reach the model; ideal for structured data like transcripts or logs.*
- **Context Gateway**: *Open-source proxy for agents that compresses conversation history, tool outputs, and tool lists; compatible with Claude Code, OpenClaw, Codex, and more.*
- **Coarse-grained filtering**: *Pass a query and a list of chunks to retrieve only the relevant ones, reducing context before fine-grained compression.*
- **Fine-grained compression**: *Token-level compression given a query and context, stripping irrelevant tokens for maximum efficiency.*
- **Usage-based pricing**: *Pay per million tokens processed, with no upfront commitment; get started by signing up and calling the API.*
- **Demo environment**: *Try compression live on sample datasets like SEC filings directly from the dashboard, without writing any code.*

## Features

- Token-level context compression
- Chunk-level context filtering
- Query-agnostic pre-compression
- Query-specific compression
- Up to 200x compression ratio
- Context Gateway for agents
- Conversation history compression
- Tool output compression
- RAG pipeline optimization
- Open-source proxy gateway
- REST API access
- Dashboard demo environment

## Integrations

Claude Code, OpenClaw, Codex, GPT models, RAG pipelines

## Platforms

macOS, Web, API

## Pricing

Open Source, Paid

## Links

- Website: https://compresr.ai
- Documentation: https://compresr.ai/docs
- Repository: https://github.com/Compresr-ai/Context-Gateway
- EveryDev.ai: https://www.everydev.ai/tools/compresr
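To make the fine-grained (query-specific) compression flow concrete, here is a minimal sketch of assembling such a request. The field names, model identifier, and the commented endpoint are assumptions for illustration only, not the documented API; see https://compresr.ai/docs for the real request shape.

```python
import json

def build_compress_request(query: str, context: str, model: str = "latte-v1") -> dict:
    """Assemble a JSON payload for a query-specific compression call.

    NOTE: the keys and the "latte-v1" model id are hypothetical placeholders,
    chosen to mirror the Latte V1 model described above.
    """
    return {
        "model": model,      # which compression model to apply
        "query": query,      # the question driving token relevance
        "context": context,  # the long document to compress
    }

payload = build_compress_request(
    query="What was the company's Q3 revenue?",
    context="(long SEC filing text here)",
)
print(json.dumps(payload, indent=2))

# The payload could then be sent with any HTTP client, e.g.:
#   requests.post(COMPRESR_ENDPOINT, json=payload,
#                 headers={"Authorization": f"Bearer {api_key}"})
# where COMPRESR_ENDPOINT and api_key come from your account setup.
```

The same payload shape would extend to coarse-grained filtering by passing a list of chunks instead of a single context string, as the Coldbrew V1 description suggests.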