# Inception Labs

> Diffusion-based large language models that generate tokens in parallel, delivering 5x faster inference with best-in-class quality at lower cost.

Inception Labs builds and deploys next-generation large language models (LLMs) powered by diffusion rather than traditional auto-regressive generation. By using diffusion, their Mercury models produce many tokens in parallel, making them several times faster and less than half the cost of conventional LLMs. The diffusion framework provides fine-grained control over outputs, allowing adherence to specific schemas and semantic constraints while offering a unified paradigm for combining language with other data modalities.

- **Parallel Token Generation** enables Mercury models to generate multiple tokens simultaneously instead of one at a time, resulting in inference speeds 5x faster than traditional LLMs.
- **Mercury 2 Reasoning Model** is the fastest reasoning LLM and the first reasoning diffusion LLM, ideal for complex applications where both quality and speed are crucial.
- **Mercury Edit** is a small, coding-focused diffusion LLM designed for code editing and extremely latency-sensitive components of coding workflows.
- **OpenAI API Compatible** means Mercury models integrate seamlessly into existing LLM workflows as a drop-in replacement with minimal code changes.
- **Enterprise-Grade Deployment** options include the Inception API, AWS Bedrock, Azure Foundry, and model routers such as OpenRouter, with configurable data retention, private networking, and custom SLAs.
- **Real-Time Voice Applications** enable natural AI engagement in voice-powered workflows such as customer support, translation, and immersive gaming experiences.
- **Lightning Fast Agents** automate complex coding and business workflows with ultra-responsive AI that keeps users in flow rather than interrupting their thinking.
- **Cost-Effective Pricing** at $0.25 per 1M input tokens and $0.75 per 1M output tokens makes high-performance AI accessible for production applications.

To get started, request early access through the Inception website or access Mercury through AWS Bedrock, Azure Foundry, or model routers. The API is OpenAI-compatible, requiring only a one-line code change for integration (a minimal sketch of that change appears at the end of this page). Documentation is available at docs.inceptionlabs.ai for detailed implementation guidance.

## Features

- Parallel token generation
- Diffusion-based language models
- Mercury 2 reasoning model
- Mercury Edit coding model
- OpenAI API compatible
- Real-time voice applications
- Lightning fast agents
- Instant code editing
- Rapid search capabilities
- Enterprise-grade privacy
- AWS Bedrock integration
- Azure Foundry integration
- Custom SLAs
- No training on customer data
- Configurable data retention

## Integrations

AWS Bedrock, Azure Foundry, OpenRouter, Poe, OpenAI API

## Platforms

Web, API

## Pricing

Paid

## Links

- Website: https://www.inceptionlabs.ai
- Documentation: https://docs.inceptionlabs.ai/get-started/get-started
- EveryDev.ai: https://www.everydev.ai/tools/inception-labs
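
## Quickstart Example

Because the API follows the OpenAI chat-completions format, the usual integration path is to keep the standard OpenAI SDK and only swap the endpoint. The sketch below illustrates that pattern; the `base_url` and model name are assumptions for illustration and should be confirmed against docs.inceptionlabs.ai.

```python
# Minimal sketch of the "one-line change": the standard OpenAI Python SDK
# pointed at Inception's OpenAI-compatible endpoint.
# NOTE: the base_url and model name below are assumptions; verify both
# in the official documentation before use.
from openai import OpenAI

client = OpenAI(
    api_key="YOUR_INCEPTION_API_KEY",            # key issued with early access
    base_url="https://api.inceptionlabs.ai/v1",  # assumed endpoint; verify in the docs
)

response = client.chat.completions.create(
    model="mercury",  # assumed model identifier; verify in the docs
    messages=[
        {"role": "user", "content": "Write a Python function that reverses a string."}
    ],
)

print(response.choices[0].message.content)
```

In an existing OpenAI-based application, only the client construction (API key and `base_url`) and the model name would change; the rest of the request and response handling stays the same.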