
    Compresr

    Context Engineering

Compresr compresses LLM context to reduce token costs, improve accuracy, and cut latency in AI pipelines using query-specific and query-agnostic compression models.

    Visit Website

    At a Glance

    Pricing

Paid (usage-based)

Espresso V1: $0.10
Latte V1: $0.25
Coldbrew V1: $0.25

Engagement

2 views · 0 upvotes · 0 discussions

    Available On

    macOS
    Web
    API

    Resources

Website
Docs
GitHub
llms.txt

    Topics

Context Engineering
LLM Orchestration
Retrieval-Augmented Generation

    Listed Mar 2026

    About Compresr

    Compresr is a context compression API for LLM pipelines that reduces token usage by up to 200x without quality loss. It offers multiple compression models — from query-agnostic pre-compression to aggressive query-specific filtering — enabling developers to cut costs, reduce latency, and improve accuracy in RAG, search, Q&A, and agent workflows. Backed by Y Combinator (W26), Compresr also provides an open-source Context Gateway for agent frameworks like Claude Code and OpenClaw.

    • Espresso V1 — Query-agnostic token-level compression; pre-compress long documents, system prompts, or agent histories once and reuse across multiple queries.
    • Latte V1 — Query-specific token-level compression that retains only tokens relevant to a given question, enabling up to 200x compression; ideal for RAG, search, and Q&A pipelines.
    • Coldbrew V1 — Query-specific chunk-level filtering that drops entire irrelevant chunks before they reach the model; ideal for structured data like transcripts or logs.
    • Context Gateway — Open-source proxy for agents that compresses conversation history, tool outputs, and tool lists; compatible with Claude Code, OpenClaw, Codex, and more.
    • Coarse-grained filtering — Pass a query and a list of chunks to retrieve only the relevant ones, reducing context before fine-grained compression.
    • Fine-grained compression — Token-level compression given a query and context, stripping irrelevant tokens for maximum efficiency (see the usage sketch after this list).
    • Usage-based pricing — Pay per million tokens processed, with no upfront commitment; get started by signing up and calling the API.
    • Demo environment — Try compression live on sample datasets like SEC filings directly from the dashboard without writing any code.
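The coarse-then-fine flow above maps onto two API calls. The sketch below shows one plausible shape for such a pipeline; the host, endpoint paths, and JSON field names are illustrative assumptions rather than Compresr's documented interface, so treat it as pseudocode and consult the official docs for the real API.

```python
import requests

# Hypothetical sketch of a coarse-then-fine compression pipeline.
# ASSUMPTIONS: the base URL, endpoint paths, and JSON field names below
# are placeholders for illustration, NOT Compresr's documented API.
API = "https://api.compresr.example"            # placeholder host
HEADERS = {"Authorization": "Bearer YOUR_KEY"}  # placeholder credentials

query = "What were Q3 operating expenses?"
chunks = [
    "Q3 revenue grew 12% year over year...",
    "Operating expenses in Q3 were $4.2M...",
    "The board approved a new buyback program...",
]

# 1) Coarse-grained filtering: drop whole chunks that are irrelevant
#    to the query (the chunk-level step, Coldbrew-style).
resp = requests.post(f"{API}/v1/filter", headers=HEADERS,
                     json={"query": query, "chunks": chunks})
relevant = resp.json()["chunks"]

# 2) Fine-grained compression: strip irrelevant tokens from the
#    surviving chunks (the token-level step, Latte-style).
resp = requests.post(f"{API}/v1/compress", headers=HEADERS,
                     json={"query": query, "context": "\n".join(relevant)})
compressed = resp.json()["compressed"]

# 3) Hand the much smaller context to the downstream LLM as usual.
print(f"{compressed}\n\nQ: {query}")
```

Query-agnostic (Espresso-style) pre-compression would follow the same pattern without a query field: compress a long document or system prompt once, cache the result, and reuse it across many requests.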

    Community Discussions

No discussions yet.

    Pricing

    Espresso V1

    Query-agnostic token-level compression. Pre-compress long documents, system prompts, or agent histories once and reuse across multiple queries.

$0.10
usage-based
    • Query-agnostic token-level compression
    • Pre-compress long documents
    • Compress system prompts
    • Compress agent histories
    • Reuse compressed context across multiple queries

    Latte V1

    Query-specific token-level compression. Retains only tokens relevant to a given question, enabling aggressive (up to 200x) compression. Ideal for RAG, search, and Q&A pipelines.

    $0.25
usage-based
    • Query-specific token-level compression
    • Up to 200x compression ratio
    • Ideal for RAG pipelines
    • Ideal for search and Q&A workflows

    Coldbrew V1

    Query-specific chunk-level filtering. Drops entire irrelevant chunks before they reach the model — ideal for structured data like transcripts or logs.

    $0.25
usage-based
    • Query-specific chunk-level filtering
    • Drops irrelevant chunks before model inference
    • Ideal for transcripts and logs
    • No retrieval index required
    View official pricing
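As a back-of-the-envelope cost check: assuming the listed prices are per million tokens processed (consistent with the usage-based, pay-per-million-tokens description above) and an illustrative downstream model price of $3 per million input tokens, a hypothetical figure chosen only for the arithmetic, the best-case 200x ratio works out as follows.

```python
# Illustrative cost arithmetic -- every price here is an assumption,
# not a quote; the 200x ratio is the listing's best-case figure.
doc_tokens = 1_000_000        # raw context fed into compression
ratio = 200                   # best-case compression ratio from the listing
latte_price = 0.25            # assumed $ per 1M tokens processed (Latte V1)
llm_input_price = 3.00        # hypothetical $ per 1M LLM input tokens

compress_cost = doc_tokens / 1e6 * latte_price        # $0.25
kept_tokens = doc_tokens / ratio                      # 5,000 tokens survive
llm_cost_after = kept_tokens / 1e6 * llm_input_price  # $0.015
llm_cost_raw = doc_tokens / 1e6 * llm_input_price     # $3.00

print(f"compress: ${compress_cost:.3f}  "
      f"LLM after: ${llm_cost_after:.3f}  "
      f"LLM raw: ${llm_cost_raw:.2f}")
```

Under these assumed numbers, even a single query comes out ahead (roughly $2.74 net saved per million raw tokens), and query-agnostic pre-compression amortizes the compression cost across every query that reuses the same document.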

    Capabilities

    Key Features

    • Token-level context compression
    • Chunk-level context filtering
    • Query-agnostic pre-compression
    • Query-specific compression
    • Up to 200x compression ratio
    • Context Gateway for agents
    • Conversation history compression
    • Tool output compression
    • RAG pipeline optimization
    • Open-source proxy gateway
    • REST API access
    • Dashboard demo environment

    Integrations

    Claude Code
    OpenClaw
    Codex
    GPT models
    RAG pipelines
    API Available
    View Docs

    Reviews & Ratings

    No ratings yet


    Developer

    Compresr Team

    Compresr builds context compression infrastructure for LLM pipelines, helping developers cut token costs and improve model accuracy. The team, backed by Y Combinator (W26), ships both a managed API and an open-source Context Gateway for agent frameworks. Compresr's models achieve up to 200x compression without quality loss, making AI workflows faster and cheaper at scale.

    Founded 2026
    San Francisco, CA
    $500 raised
    4 employees

    Used by

    Confidential customer (mentioned in YC…
Website · GitHub · LinkedIn · X / Twitter
    1 tool in directory

    Similar Tools


    Context-Gateway

    An open-source context gateway for AI applications that manages and compresses context to optimize LLM token usage and reduce costs.


    Hyperspell

    Memory and context layer for AI agents that connects to user data sources for automatic memory and context-aware responses.


    LangMem

    Open-source SDK from LangChain for long-term memory in LLM agents, with hot-path tools, a background memory manager, and native LangGraph storage integration.


    Related Topics

    Context Engineering

    Techniques for optimizing context windows to improve AI responses.

    23 tools

    LLM Orchestration

    Platforms and frameworks for designing, managing, and deploying complex LLM workflows, with visual interfaces for coordinating multiple AI models and services.

    60 tools

    Retrieval-Augmented Generation

    Systems that enhance LLM outputs by retrieving relevant information from external knowledge bases, combining generative AI with information retrieval for more accurate, contextual responses (see the sketch below).

    40 tools
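For orientation, the retrieve-then-generate loop at the heart of RAG fits in a few lines. The scorer and final generation step below are toy placeholders, not any particular library's API; real systems use embedding-based retrieval over a vector index.

```python
# Minimal RAG sketch: retrieve relevant passages, then generate with them.

def score(query: str, passage: str) -> int:
    # Toy lexical-overlap relevance score (real systems use embeddings).
    return len(set(query.lower().split()) & set(passage.lower().split()))

def rag_prompt(query: str, knowledge_base: list[str], top_k: int = 2) -> str:
    # 1) Retrieve: rank passages by relevance to the query.
    ranked = sorted(knowledge_base, key=lambda p: score(query, p), reverse=True)
    context = "\n".join(ranked[:top_k])
    # 2) Generate: a real system would send this prompt to an LLM.
    return f"Context:\n{context}\n\nQuestion: {query}\nAnswer:"

kb = [
    "RAG retrieves external knowledge before generation.",
    "Compresr compresses LLM context to cut token costs.",
    "Vector databases index embeddings for similarity search.",
]
print(rag_prompt("How does RAG use external knowledge?", kb))
```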