AssemblyAI
Speech-to-text and speech understanding API platform for building Voice AI applications with industry-leading accuracy.
About AssemblyAI
AssemblyAI provides industry-leading speech-to-text and speech understanding models that power Voice AI applications for thousands of companies worldwide. The platform enables developers to transcribe audio and video files, process live streaming audio, and extract insights from voice data with exceptional accuracy and low latency. AssemblyAI processes over 40 terabytes of audio daily and serves 600M+ inference calls per month.
- Speech-to-Text converts pre-recorded audio and video files into accurate transcripts with support for 99 languages, speaker diarization, automatic punctuation, and word-level timestamps.
- Streaming Speech-to-Text enables real-time transcription with ultra-low latency for voice agents and live applications, featuring built-in turn detection and unlimited concurrency.
- Speech Understanding provides audio intelligence capabilities including sentiment analysis, entity detection, topic detection, auto chapters, summarization, and speaker identification.
- LLM Gateway unifies voice-to-intelligence workflows by applying Large Language Models directly to audio content through a single API.
- Guardrails ensures content safety with profanity filtering, PII redaction from both text and audio, and content moderation for sensitive topics.
- Enterprise Features include SOC 2 Type 2, ISO 27001, HIPAA compliance with BAA, EU data residency, self-hosted deployments, and dedicated support.
To get started, sign up for a free account with $50 in credits to access the API. Install the SDK for Python, JavaScript, or other supported languages, then submit audio files or connect streaming audio to receive transcripts. The playground allows testing models without code. Enterprise customers can contact sales for volume discounts, custom deployments, and dedicated infrastructure options.
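As a rough illustration of the "submit audio files" step, the sketch below builds a transcript request using only the Python standard library rather than the SDK. The endpoint URL, the `authorization` header, and the `audio_url` and `speaker_labels` fields are assumptions based on the publicly documented REST API and should be checked against the current API reference. The helper name `build_transcript_request`, the placeholder key, and the example audio URL are hypothetical.

```python
import json
import urllib.request

# Assumed endpoint; confirm against the current API reference before use.
API_URL = "https://api.assemblyai.com/v2/transcript"


def build_transcript_request(api_key, audio_url, *, speaker_labels=True):
    """Build (but do not send) a POST request that submits a hosted audio file.

    `speaker_labels` is assumed to be the field that enables speaker diarization.
    """
    payload = {"audio_url": audio_url, "speaker_labels": speaker_labels}
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"authorization": api_key, "content-type": "application/json"},
        method="POST",
    )


# Placeholder key and audio URL, for illustration only.
request = build_transcript_request("YOUR_API_KEY", "https://example.com/meeting.mp3")
print(request.get_method(), request.full_url)
print(request.data.decode("utf-8"))
```

Nothing is sent until the request is passed to `urllib.request.urlopen(request)` with a real key. Transcription of pre-recorded files is typically asynchronous, so the response would normally carry a job identifier to poll for the finished transcript; the official SDKs wrap that loop.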

Pricing
Free Plan Available
Access to industry-leading Speech-to-Text and Audio Intelligence models
- Transcribe up to 185 hours of pre-recorded audio for free
- Transcribe up to 333 hours of streaming audio for free
- Up to 5 new streams per minute
- Developer docs, community support, and resources
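If the free-hour limits above simply reflect the $50 signup credit mentioned earlier (an assumption; the listing does not state this), the implied per-hour rates can be worked out with a short calculation. The function name `implied_rate` is hypothetical.

```python
FREE_CREDIT_USD = 50.0        # signup credit stated in the listing
PRERECORDED_FREE_HOURS = 185  # free pre-recorded hours stated in the listing
STREAMING_FREE_HOURS = 333    # free streaming hours stated in the listing


def implied_rate(credit_usd, hours):
    """Return the per-hour price implied by spending the whole credit on `hours`."""
    return credit_usd / hours


prerecorded_rate = implied_rate(FREE_CREDIT_USD, PRERECORDED_FREE_HOURS)
streaming_rate = implied_rate(FREE_CREDIT_USD, STREAMING_FREE_HOURS)

print(f"pre-recorded: ${prerecorded_rate:.2f}/hour")  # prints "pre-recorded: $0.27/hour"
print(f"streaming:    ${streaming_rate:.2f}/hour")    # prints "streaming:    $0.15/hour"
```

These figures are estimates derived from the listing's own numbers, not quoted prices; the pricing page is the authority for actual rates.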
Pay as you go - Universal
Fast, accurate transcription across 99 languages
- Unlimited access to Speech-to-Text, Audio Intelligence, and LeMUR
- Unlimited concurrent streams
- Pre-recorded concurrency starting at 200 files
- Customize rate limits
- Dedicated technical support
Pay as you go - Slam-1
Highest accuracy transcription powered by LLM intelligence
- LLM-powered contextual understanding
- English only
- Beta access
Enterprise
Custom pricing for enterprise needs
- Custom rate limits and enhanced concurrency
- Enterprise-grade flexibility
- BAA for HIPAA compliance
- EU Data Residency standards
- Self-hosted deployments (On-prem, EU, VPC)
- Customized SLAs and SLOs
- 24/7 support and dedicated solution architects
Capabilities
Key Features
- Speech-to-Text for pre-recorded audio
- Streaming Speech-to-Text for real-time transcription
- Speaker diarization
- Support for 99 languages
- Automatic language detection
- Sentiment analysis
- Entity detection
- Topic detection
- Auto chapters
- Summarization
- Speaker identification
- Translation
- Custom formatting
- PII redaction
- PII audio redaction
- Profanity filtering
- Content moderation
- LLM Gateway
- Word-level timestamps
- Keyterms prompting
- Custom spelling