Pipecat
An open-source Python framework for building real-time voice and multimodal conversational AI agents with composable pipelines and ultra-low latency.
At a Glance
About Pipecat
Pipecat is an open-source Python framework for building real-time voice and multimodal conversational agents. It orchestrates audio, video, AI services, and transport layers into composable pipelines, letting developers focus on what makes their agent unique rather than infrastructure plumbing. Licensed under BSD 2-Clause, Pipecat supports a wide ecosystem of speech, LLM, and transport integrations out of the box.
- Voice-first pipeline architecture — install via `pip install pipecat-ai` or `uv add pipecat-ai` and wire together STT, LLM, and TTS services in minutes
- Extensive service integrations — supports 20+ STT providers (Deepgram, AssemblyAI, OpenAI Whisper, Azure, Google, etc.), 20+ LLMs (OpenAI, Anthropic, Gemini, Groq, Mistral, etc.), and 30+ TTS engines (ElevenLabs, Cartesia, AWS, Azure, etc.)
- Real-time transport options — connect via WebRTC (Daily, LiveKit), WebSockets, FastAPI, or WhatsApp for ultra-low latency interactions
- Speech-to-speech support — integrates OpenAI Realtime, Gemini Multimodal Live, AWS Nova Sonic, and Ultravox for end-to-end voice pipelines
- Multi-agent systems — use Pipecat Subagents to build distributed agents that communicate through a shared message bus and hand off conversations between specialists
- Structured conversations — Pipecat Flows manages complex conversational states and transitions for guided dialog systems
- Client SDKs for every platform — official SDKs for JavaScript, React, React Native, Swift (iOS), Kotlin (Android), C++, and ESP32
- Audio processing utilities — built-in VAD (Silero), noise cancellation (Krisp Viva, RNNoise, Koala), and audio filtering
- Observability and analytics — integrates OpenTelemetry and Sentry for pipeline monitoring and error tracking
- CLI tooling — use `pipecat init quickstart` or the Pipecat CLI to scaffold, monitor, and deploy agents to production
- Debugging tools — Whisker provides real-time pipeline debugging; Tail offers a terminal dashboard for live monitoring
- Community integrations — browse and contribute service integrations via the community integrations registry
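The composable-pipeline idea above — frames flowing through a chain of interchangeable processors (STT, LLM, TTS) — can be illustrated with a minimal, self-contained Python sketch. Note this is a conceptual illustration of the pattern, not Pipecat's actual API; all class and frame names here are hypothetical.

```python
# Conceptual sketch of a composable frame pipeline (hypothetical names,
# NOT Pipecat's real API): each stage receives a frame and may pass it
# through, transform it, or emit new frames.
from dataclasses import dataclass
from typing import List


@dataclass
class Frame:
    """A unit of data moving through the pipeline (audio, text, etc.)."""
    kind: str
    payload: str


class Processor:
    """A pipeline stage: receives a frame, returns zero or more frames."""
    def process(self, frame: Frame) -> List[Frame]:
        return [frame]  # default behavior: pass through unchanged


class LLMStub(Processor):
    """Stand-in for an LLM stage: turns transcript frames into replies."""
    def process(self, frame: Frame) -> List[Frame]:
        if frame.kind == "transcript":
            # A real LLM service would call a model here.
            return [Frame("reply", frame.payload.upper())]
        return [frame]


class Pipeline:
    """Composes processors in order, like wiring STT -> LLM -> TTS."""
    def __init__(self, stages: List[Processor]):
        self.stages = stages

    def run(self, frames: List[Frame]) -> List[Frame]:
        for stage in self.stages:
            out: List[Frame] = []
            for f in frames:
                out.extend(stage.process(f))
            frames = out
        return frames


# Wire two stages together and push a frame through.
pipeline = Pipeline([Processor(), LLMStub()])
result = pipeline.run([Frame("transcript", "hello pipecat")])
print(result[0].payload)  # -> HELLO PIPECAT
```

Because every stage shares the same `process` interface, swapping one STT or LLM provider for another is just a matter of replacing one element in the list — the property the framework's service-integration ecosystem builds on.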
Pricing
OPEN SOURCE
Open Source
Free to use, modify, and distribute under the BSD 2-Clause license.
- Full framework source code
- All service integrations
- Client SDKs (JS, React, iOS, Android, C++)
- CLI tooling
- Community support via Discord
Capabilities
Key Features
- Real-time voice and multimodal AI agent pipelines
- Composable modular pipeline architecture
- 20+ STT provider integrations
- 20+ LLM integrations
- 30+ TTS engine integrations
- Speech-to-speech support (OpenAI Realtime, Gemini Multimodal Live, etc.)
- WebRTC and WebSocket transport layers
- Multi-agent systems via Pipecat Subagents
- Structured conversation flows
- Client SDKs for JS, React, iOS, Android, C++
- Audio processing: VAD, noise cancellation
- OpenTelemetry and Sentry observability
- CLI for project scaffolding and deployment
- Real-time debugger (Whisker)
- Terminal dashboard (Tail)
- Community integrations registry
Integrations
Deepgram
AssemblyAI
OpenAI
Anthropic
Google Gemini
Azure
AWS
ElevenLabs
Cartesia
Groq
Mistral
Daily
LiveKit
Twilio
Telnyx
Vonage
Plivo
Genesys
HeyGen
Tavus
Simli
mem0
Silero VAD
Krisp Viva
RNNoise
OpenTelemetry
Sentry
WhatsApp
Ollama
Fireworks AI
Together AI
Perplexity
OpenRouter
DeepSeek
Cerebras
NVIDIA NIM
Ultravox
fal
Moondream