TEN Framework
An open-source framework for building real-time multimodal conversational AI agents with voice, video, and extensible plugin support.
About TEN Framework
TEN Framework is an open-source project maintained by the TEN-framework organization (backed by Agora) for building real-time multimodal conversational AI agents. It targets developers who want to compose voice and video AI pipelines from modular extensions without writing low-level real-time communication infrastructure from scratch. The repository has accumulated over 10,000 GitHub stars since its creation in June 2024, signaling strong community interest.
What It Is
TEN Framework provides the runtime, extension model, and tooling needed to assemble conversational AI agents that operate over real-time audio and video streams. Rather than a single monolithic agent, it is a composition framework: developers wire together speech-to-text (STT), large language model (LLM), and text-to-speech (TTS) extensions into a pipeline, then deploy that pipeline as a containerized service. The framework handles the low-latency transport layer (built on Agora RTC) so that extensions can focus on AI logic.
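The composition idea can be illustrated with a minimal Python sketch. This is not TEN's extension API: the Pipeline class and the fake_stt, fake_llm, and fake_tts functions are hypothetical stand-ins, and plain strings take the place of audio frames. It only shows the shape of chaining STT, LLM, and TTS stages so that each stage sees the previous stage's output.

```python
from dataclasses import dataclass, field
from typing import Callable, List

# A stage receives the previous stage's output and returns its own.
Stage = Callable[[str], str]


@dataclass
class Pipeline:
    """Hypothetical stand-in for an extension graph; NOT TEN's real API."""
    stages: List[Stage] = field(default_factory=list)

    def add(self, stage: Stage) -> "Pipeline":
        self.stages.append(stage)
        return self  # returning self allows chained .add() calls

    def run(self, payload: str) -> str:
        # Feed the payload through every stage in insertion order.
        for stage in self.stages:
            payload = stage(payload)
        return payload


def fake_stt(audio: str) -> str:
    # Placeholder STT: treat the text after "audio:" as the transcript.
    prefix = "audio:"
    return audio[len(prefix):] if audio.startswith(prefix) else audio


def fake_llm(text: str) -> str:
    # Placeholder LLM: echo the transcript back as a reply.
    return f"You said: {text}"


def fake_tts(text: str) -> str:
    # Placeholder TTS: tag the reply as synthesized speech.
    return f"speech:{text}"


def build_voice_pipeline() -> Pipeline:
    return Pipeline().add(fake_stt).add(fake_llm).add(fake_tts)


if __name__ == "__main__":
    print(build_voice_pipeline().run("audio:hello"))
    # prints "speech:You said: hello"
```

In a real deployment each placeholder would be replaced by an extension that calls a provider, and the framework, rather than a Python loop, would move audio and text between them.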
TEN Ecosystem
The framework ships as part of a broader ecosystem of related projects:
- TEN Framework — the core runtime and extension API (Apache 2.0 with additional conditions)
- TEN VAD — a low-latency streaming voice activity detector
- TEN Turn Detection — enables full-duplex dialogue by detecting speaker turns
- Agent Examples — a collection of ready-to-run use-case demos
- TEN Portal — the official documentation and blog site at theten.ai
Each component is a separate GitHub repository under the TEN-framework organization, allowing teams to adopt only the pieces they need.
Agent Examples and Use Cases
The repository ships several reference agent implementations that demonstrate the framework's range:
- Multi-Purpose Voice Assistant — low-latency assistant supporting both RTC and WebSocket connections, extensible with memory, VAD, and turn detection
- Doodler — converts spoken or typed prompts into real-time hand-drawn sketches
- Speaker Diarization — real-time speaker detection and labeling
- Lip Sync Avatars — integrates Live2D characters and realistic avatars from vendors including Trulience, HeyGen, and Tavus
- SIP Call — phone-call integration via a SIP extension
- Transcription — audio-to-text transcription pipeline
- ESP32-S3 Korvo V3 — runs a TEN agent on embedded hardware for LLM-powered IoT communication
Setup Path
The recommended local development path uses Docker Compose and Node.js (LTS v18). Prerequisites are an Agora App ID and App Certificate, plus API keys for OpenAI (LLM), Deepgram (ASR, which serves as the speech-to-text stage), and ElevenLabs (TTS). The minimum system requirements are 2 CPU cores and 4 GB of RAM. After cloning the repository and configuring a .env file, developers run docker compose up, enter the container, and build one of the example agents. The TMAN Designer UI at localhost:49483 provides a visual interface for wiring extensions, and the agent UI runs at localhost:3000. GitHub Codespaces is also supported for zero-install experimentation.
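Because the setup depends on several credentials being present in the .env file, a small pre-flight check can catch a missing key before the container starts. The sketch below is one way to do that in Python. The variable names in REQUIRED_KEYS are assumptions chosen for illustration and should be checked against the example .env file shipped with the repository; the helper functions are hypothetical and not part of TEN.

```python
from pathlib import Path
from typing import Dict, List

# ASSUMED variable names; confirm against the repository's example .env.
REQUIRED_KEYS = [
    "AGORA_APP_ID",
    "AGORA_APP_CERTIFICATE",
    "OPENAI_API_KEY",
    "DEEPGRAM_API_KEY",
    "ELEVENLABS_TTS_KEY",
]


def parse_env(text: str) -> Dict[str, str]:
    """Parse simple KEY=VALUE lines, skipping blanks and # comments."""
    values: Dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding whitespace and optional quotes around the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def missing_keys(env: Dict[str, str],
                 required: List[str] = REQUIRED_KEYS) -> List[str]:
    """Return required keys that are absent or set to an empty value."""
    return [key for key in required if not env.get(key)]


def check_env_file(path: str = ".env") -> List[str]:
    """Report missing keys for a file; a missing file means all are missing."""
    env_path = Path(path)
    if not env_path.is_file():
        return list(REQUIRED_KEYS)
    return missing_keys(parse_env(env_path.read_text(encoding="utf-8")))


if __name__ == "__main__":
    missing = check_env_file()
    print("missing:", ", ".join(missing) if missing else "none")
```

Running the script from the directory that holds the .env file lists any keys that still need values; an empty result suggests the file is ready for docker compose up.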
License and Tradeoffs
The framework is released under Apache License 2.0 with additional conditions imposed by Agora. Specifically, the license prohibits hosting TEN on end-user devices (including mobile) and prohibits deploying it in ways that compete with Agora's offerings. Developers building their own applications for their own end users are explicitly permitted. The components in the packages directory use a plain Apache 2.0 license. Under this hybrid model, TEN is source-available and free for most application builders, but its license is not a fully permissive open-source license in the OSI sense.
Update: Version 0.11.66
The latest release is version 0.11.66, published on May 26, 2026. The repository shows monthly commit activity and 213 open issues as of mid-2026, indicating ongoing development. The project continues to expand its extension ecosystem and agent example library, and it welcomes community contributions via GitHub Issues and Projects.
Pricing
Open Source
Free to use under Apache 2.0 with additional Agora conditions. Self-hosted.
- Full framework source code
- All agent examples
- TEN VAD and Turn Detection
- TMAN Designer UI
- Docker-based deployment
Capabilities
Key Features
- Real-time multimodal conversational AI runtime
- Modular extension model for STT, LLM, and TTS pipelines
- Built-in Agora RTC transport for low-latency audio/video
- Voice Activity Detection (TEN VAD)
- Full-duplex turn detection
- Multi-Purpose Voice Assistant example
- Speaker diarization support
- Lip sync avatar integration (Live2D, Trulience, HeyGen, Tavus)
- SIP call extension for phone integration
- Transcription pipeline
- ESP32-S3 embedded hardware support
- TMAN Designer visual UI for wiring extensions
- Docker Compose-based local development
- GitHub Codespaces support
- WebSocket and RTC connection modes
- Memory extension support
- Self-hosting via Docker or cloud services (Vercel, Netlify, Fly.io, etc.)
