# Inworld AI

> Production-grade voice AI APIs offering top-ranked text-to-speech, speech-to-speech, speech-to-text, and LLM routing for developers building natural conversational applications.

Inworld AI provides production-grade voice AI APIs, ranked #1 on the Artificial Analysis Speech Arena, for real-time text-to-speech, speech-to-speech, speech-to-text, and LLM routing. The platform delivers sub-130ms first-chunk latency and supports more than 100 languages, which suits it to companion apps, agentic workforces, learning platforms, health and wellness apps, and interactive media. All capabilities are available through a unified API, with SOC2 Type II, HIPAA, and GDPR compliance built in.

- **Realtime TTS** — *Top-ranked text-to-speech with sub-130ms latency, starting at $15/1M characters; supports voice cloning from 15 seconds of audio, text-based voice design, advanced inline voice direction, and cross-lingual output in 100+ languages.*
- **Realtime Speech-to-Speech API** — *End-to-end full-duplex audio streaming over WebSocket or WebRTC with custom voices, tool calling, intelligent turn detection, and dynamic context management mid-session.*
- **Realtime STT** — *Speech-to-text with real-time voice profiling (emotion, age, accent, pitch, style), semantic and acoustic VAD, word-level timestamps, speaker diarization, and custom vocabulary support.*
- **Realtime LLM Router** — *Single API that routes requests across OpenAI, Anthropic, Google, xAI, Groq, Mistral, and 200+ models with built-in failover, A/B testing, user-aware and context-aware routing, and no added latency.*
- **Voice Cloning & Design** — *Clone a voice from 15 seconds of audio or describe a voice in natural language to generate a production-ready custom voice without recording.*
- **Advanced Voice Direction** — *Add bracketed instructions anywhere in text to adjust tone, speed, volume, vocal style, and pauses in real time.*
- **Enterprise Security** — *SOC2 Type II certified, HIPAA compliant, GDPR compliant; optional zero data retention, on-prem deployment, and EU/India data residency available.*
- **Credit-Based Billing** — *Monthly credits usable across TTS, STT, and LLMs; higher tiers unlock volume discounts up to 40% off standard rates.*
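
The pricing figures above lend themselves to a quick back-of-the-envelope estimate. The sketch below is only illustrative: it uses the listed "starting at" rate of $15 per 1M characters and the stated 40% maximum discount, and it assumes the discount applies directly to the TTS rate. The function name `tts_cost_usd` is hypothetical, and actual rates and discount tiers may vary by plan.

```python
# Figures taken from the listing; the real rate may vary by model and plan.
TTS_RATE_USD_PER_MILLION_CHARS = 15.00  # "starting at $15/1M characters"
MAX_VOLUME_DISCOUNT = 0.40              # "up to 40% off standard rates"


def tts_cost_usd(characters: int, discount: float = 0.0) -> float:
    """Estimate TTS spend for a character count at a given discount fraction.

    Assumption: the volume discount is applied directly to the per-character rate.
    """
    if characters < 0:
        raise ValueError("characters must be non-negative")
    if not 0.0 <= discount <= MAX_VOLUME_DISCOUNT:
        raise ValueError("discount must be between 0 and 0.40")
    standard = characters / 1_000_000 * TTS_RATE_USD_PER_MILLION_CHARS
    return round(standard * (1.0 - discount), 2)


if __name__ == "__main__":
    monthly_chars = 25_000_000
    print(tts_cost_usd(monthly_chars))        # cost at the standard rate
    print(tts_cost_usd(monthly_chars, 0.40))  # cost at the maximum listed discount
```

At 25 million characters per month, this gives $375 at the standard rate and $225 at the maximum discount, which brackets the likely spend before checking the actual tier pricing.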

## Features
- Realtime text-to-speech (TTS)
- Speech-to-speech API
- Speech-to-text (STT)
- LLM routing across 200+ models
- Voice cloning from 15 seconds of audio
- Text-based voice design
- Advanced inline voice direction
- Cross-lingual support (100+ languages)
- Full-duplex WebSocket/WebRTC streaming
- Intelligent turn detection
- Tool calling and dynamic context management mid-session
- Voice profiling (emotion, age, accent, pitch, style)
- Word-level timestamps and speaker diarization
- Custom vocabulary support
- User-aware and context-aware LLM routing
- Built-in A/B testing and failover
- SOC2 Type II, HIPAA, GDPR compliance
- Zero data retention (add-on)
- On-prem deployment (Enterprise)
- EU and India data residency (Enterprise)
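
To illustrate the idea of inline voice direction, the sketch below separates bracketed instructions from the text to be spoken. It is not Inworld's parser or tag vocabulary: the function `split_directions` and the example tags `[whisper]` and `[pause]` are hypothetical, chosen only to show how bracketed directions can be interleaved with speech.

```python
import re

# Matches one bracketed instruction such as "[whisper]"; nested brackets are not handled.
_TAG = re.compile(r"\[([^\[\]]+)\]")


def split_directions(text: str) -> list[tuple[str, str]]:
    """Split text into ("direction", ...) and ("speech", ...) parts, in order."""
    parts: list[tuple[str, str]] = []
    pos = 0
    for match in _TAG.finditer(text):
        spoken = text[pos:match.start()].strip()
        if spoken:
            parts.append(("speech", spoken))
        parts.append(("direction", match.group(1).strip()))
        pos = match.end()
    tail = text[pos:].strip()
    if tail:
        parts.append(("speech", tail))
    return parts


if __name__ == "__main__":
    # Hypothetical tags, used only for illustration.
    for kind, value in split_directions("[whisper] Don't wake them. [pause] Okay, go."):
        print(kind, "->", value)
```

Keeping directions inline means a single string carries both the script and its delivery cues, so no separate markup document is needed.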

## Integrations
OpenAI, Anthropic, Google, xAI, Groq, Mistral, WebSocket, WebRTC
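
The failover behavior attributed to the LLM Router can be pictured with a generic pattern: try providers in priority order and fall through on errors. The sketch below is a minimal stand-in under that assumption, not the router's actual API. `route_with_failover`, `flaky_primary`, and `stable_backup` are hypothetical names, and the providers are plain Python callables rather than real SDK clients.

```python
from typing import Callable, Sequence

# Hypothetical provider signature: takes a prompt, returns text or raises.
Provider = Callable[[str], str]


def route_with_failover(
    prompt: str, providers: Sequence[tuple[str, Provider]]
) -> tuple[str, str]:
    """Try providers in priority order; return (name, reply) from the first success."""
    errors: list[str] = []
    for name, call in providers:
        try:
            return name, call(prompt)
        except Exception as exc:  # a real router would catch narrower error types
            errors.append(f"{name}: {exc}")
    raise RuntimeError("all providers failed: " + "; ".join(errors))


def flaky_primary(prompt: str) -> str:
    """Stand-in for a provider that is currently unavailable."""
    raise TimeoutError("upstream timed out")


def stable_backup(prompt: str) -> str:
    """Stand-in for a healthy fallback provider."""
    return f"echo: {prompt}"


if __name__ == "__main__":
    chain = [("primary", flaky_primary), ("backup", stable_backup)]
    print(route_with_failover("hello", chain))
```

Returning the provider name with the reply makes it easy to log which backend actually served each request.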

## Platforms
API, Web

## Pricing
Freemium — Free tier available with paid upgrades

## Links
- Website: https://inworld.ai
- Documentation: https://docs.inworld.ai/docs/introduction
- Repository: https://github.com/inworld-ai
- EveryDev.ai: https://www.everydev.ai/tools/inworld-ai