EveryDev.ai
With AI, Everyone is a Dev. EveryDev.ai © 2026
    Miso TTS 8B

    Voice Synthesis

    An open-source, 8-billion-parameter text-to-speech model designed for high-quality, highly emotive conversational speech generation, with voice cloning support.

    At a Glance

    Pricing
    Open Source

    Free to use under Modified MIT License. Run locally or access via Hugging Face.

    Available On

    CLI
    API
    SDK

    Topics

    Voice Synthesis, Generative Media, Local Inference

    Alternatives

    Supertonic, Dia, VibeVoice

    Listed Jun 2026

    About Miso TTS 8B

    Miso TTS 8B is an open-source, 8-billion-parameter text-to-dialogue model built by Miso Labs (Kamino Learning, Inc.) for high-quality conversational speech synthesis. The model is available on GitHub and Hugging Face and can be run locally on CUDA-capable hardware. A live demo is hosted on the Miso Labs landing page at misolabs.ai.

    What It Is

    Miso TTS 8B is a text-to-speech model in the RVQ (Residual Vector Quantization) Transformer category, inspired by the Sesame CSM architecture. It generates Mimi audio codes from text and optional audio context, making it suitable for conversational speech generation rather than simple single-utterance synthesis. The model currently supports English only.
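
Residual vector quantization can be illustrated with a toy example. The sketch below is not the Mimi tokenizer and uses made-up two-dimensional codebooks. It only shows the general idea: each stage quantizes whatever the previous stage left over, so a vector becomes one code index per codebook. All function names and values are illustrative assumptions.

```python
from typing import List, Sequence

Vector = Sequence[float]


def nearest(codebook: Sequence[Vector], x: Vector) -> int:
    """Return the index of the codebook entry closest to x (squared distance)."""
    def dist(entry: Vector) -> float:
        return sum((a - b) ** 2 for a, b in zip(entry, x))
    return min(range(len(codebook)), key=lambda i: dist(codebook[i]))


def rvq_encode(x: Vector, codebooks: Sequence[Sequence[Vector]]) -> List[int]:
    """Quantize x stage by stage; each stage encodes the remaining residual."""
    residual = list(x)
    codes = []
    for codebook in codebooks:
        idx = nearest(codebook, residual)
        codes.append(idx)
        # Subtract the chosen entry so the next stage sees only the leftover error.
        residual = [r - c for r, c in zip(residual, codebook[idx])]
    return codes


def rvq_decode(codes: Sequence[int], codebooks: Sequence[Sequence[Vector]]) -> List[float]:
    """Reconstruct by summing the selected entry from every codebook."""
    out = [0.0] * len(codebooks[0][0])
    for idx, codebook in zip(codes, codebooks):
        out = [o + c for o, c in zip(out, codebook[idx])]
    return out


# Toy codebooks (illustrative values only): a coarse stage and a fine stage.
TOY_CODEBOOKS = [
    [[0.0, 0.0], [1.0, 1.0], [2.0, 0.0]],
    [[0.0, 0.0], [0.25, -0.25], [-0.25, 0.25]],
]

codes = rvq_encode([1.2, 0.8], TOY_CODEBOOKS)
approx = rvq_decode(codes, TOY_CODEBOOKS)
```

In this toy setup there are two codebooks; the model described here works with 32, so each audio frame would carry 32 code indices rather than two.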

    Architecture

    The model uses two transformer components working in tandem:

    • Backbone transformer (Llama 8B): Consumes interleaved text and audio-frame embeddings, conditioning generation on conversation history.
    • Audio decoder transformer (Llama 300M): Autoregressively predicts higher-order audio codebooks within each frame.

    Key model specs include a text vocabulary of 128,256 tokens, an audio vocabulary of 2,051 tokens, 32 audio codebooks, the Mimi audio tokenizer, and a maximum sequence length of 2,048. Default inference uses torch.bfloat16 precision.
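
The specs above can be gathered into a small configuration object for sanity checks. This is a minimal sketch: the class name, field names, and the validation rule are hypothetical and are not taken from the Miso TTS repository. In particular, treating every ID below the audio vocabulary size as a valid code is an assumption, since some IDs may be reserved for special tokens.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class MisoTTSSpecs:
    """Hypothetical container for the figures listed in this profile."""
    text_vocab_size: int = 128_256
    audio_vocab_size: int = 2_051
    audio_num_codebooks: int = 32
    max_seq_len: int = 2_048
    dtype: str = "bfloat16"  # default inference precision per the listing

    def frame_is_valid(self, frame: Sequence[int]) -> bool:
        """Check that a frame has one in-range code per audio codebook."""
        return (
            len(frame) == self.audio_num_codebooks
            and all(0 <= code < self.audio_vocab_size for code in frame)
        )


SPECS = MisoTTSSpecs()
```

A frame here means one time step of audio codes, one index per codebook, which matches the description of the decoder predicting codebooks within each frame.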

    Voice Cloning and Prompted Generation

    Miso TTS 8B supports optional prompted generation, allowing the model to condition on prior audio for voice cloning. Users supply a Segment object containing a speaker ID, transcript, and audio waveform as context. Without a prompt, the model generates speech from text alone. Generated audio is watermarked by default using the SilentCipher watermarking model from Sony.
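
The shape of a prompt segment can be sketched as a plain dataclass. The listing only says a Segment holds a speaker ID, a transcript, and an audio waveform, so the field names, the integer speaker type, and the helper below are assumptions. The real class most likely stores the waveform as a tensor; a sequence of floats stands in for it here so the sketch runs without PyTorch.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class Segment:
    """Hypothetical stand-in for the repository's Segment object."""
    speaker: int            # speaker ID (integer type is an assumption)
    text: str               # transcript of the audio
    audio: Sequence[float]  # waveform samples; a tensor in practice


def validate_context(context: List[Segment]) -> List[Segment]:
    """Reject segments that cannot condition generation (illustrative helper)."""
    for i, seg in enumerate(context):
        if not seg.text.strip():
            raise ValueError(f"segment {i} has an empty transcript")
        if len(seg.audio) == 0:
            raise ValueError(f"segment {i} has no audio samples")
    return context


# A one-segment voice prompt; an empty list would mean text-only generation.
prompt = validate_context(
    [Segment(speaker=0, text="Hello there.", audio=[0.0, 0.1, -0.1])]
)
```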

    Setup Path

    The repository supports two installation paths:

    • uv (recommended): Clone the repo, run uv sync --python 3.10, activate the virtual environment, and run uv run python run_misotts.py.
    • pip: Create a Python 3.10 venv, install with pip install -e ., and run python run_misotts.py.

    Model weights are hosted publicly on Hugging Face at MisoLabs/MisoTTS and are downloaded automatically on first run via the Hugging Face Hub cache.
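
Before running either installation path, a short preflight check can catch a mismatched interpreter early. The sketch below assumes Python 3.10 is the intended version, because both paths above pin it, and that PyTorch must be importable. The function name and messages are illustrative and not part of the repository.

```python
import importlib.util
import sys
from typing import List, Optional, Tuple

REQUIRED_PYTHON = (3, 10)  # assumption: the version pinned by both setup paths


def preflight(
    version: Optional[Tuple[int, int]] = None,
    has_torch: Optional[bool] = None,
) -> List[str]:
    """Return a list of human-readable problems; an empty list means OK."""
    if version is None:
        version = (sys.version_info[0], sys.version_info[1])
    if has_torch is None:
        # find_spec looks the package up without importing it.
        has_torch = importlib.util.find_spec("torch") is not None

    problems = []
    if tuple(version) != REQUIRED_PYTHON:
        problems.append(
            f"Python {version[0]}.{version[1]} found, "
            f"expected {REQUIRED_PYTHON[0]}.{REQUIRED_PYTHON[1]}"
        )
    if not has_torch:
        problems.append("PyTorch is not importable in this environment")
    return problems


if __name__ == "__main__":
    for problem in preflight():
        print("preflight:", problem)
```

The optional arguments exist only so the check can be exercised without changing the running interpreter.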

    Deployment Notes and Safety

    The model requires a CUDA GPU with sufficient VRAM for the checkpoint precision being loaded. The repository notes that Miso TTS 8B is a large model and recommends GPU inference for best results. The project's safety guidelines explicitly prohibit using the model to impersonate people, create deceptive audio, commit fraud, or generate harmful content. Deployers are advised to use their own private watermark key.
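
A rough lower bound on VRAM follows from parameter count times bytes per parameter. The sketch below assumes nominal sizes of 8 billion and 300 million parameters for the two transformers, taken from their names rather than from measured checkpoints. It counts weights only, so activations, caches, the Mimi tokenizer, and the watermarking model would add to the total.

```python
# Bytes needed to store one parameter at each precision.
BYTES_PER_PARAM = {"bfloat16": 2, "float16": 2, "float32": 4}

# Assumption: nominal sizes implied by "Llama 8B" and "Llama 300M".
NOMINAL_PARAMS = 8_000_000_000 + 300_000_000


def weight_memory_gib(num_params: int, dtype: str = "bfloat16") -> float:
    """Estimate memory for model weights alone, in GiB."""
    return num_params * BYTES_PER_PARAM[dtype] / 2**30


bf16_gib = weight_memory_gib(NOMINAL_PARAMS)             # default precision
fp32_gib = weight_memory_gib(NOMINAL_PARAMS, "float32")  # twice the bf16 figure
```

Under these assumptions the bfloat16 weights alone come to a little over 15 GiB, which is why the choice of checkpoint precision drives the GPU requirement.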

    Current Status

    The GitHub repository was created in May 2026 and last updated in early June 2026, with 1,662 stars and 134 forks as reported by the repository metadata. The project is released under a Modified MIT License, with a commercial attribution clause applying to products exceeding 50 million monthly active users or $10 million USD in monthly revenue.
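
The two thresholds quoted above can be expressed as a simple check. This is a sketch of the summary given here, not of the license text itself: the function name is hypothetical, and reading "exceeding" as a strict comparison is an assumption that the actual license wording may refine.

```python
MAU_THRESHOLD = 50_000_000                  # monthly active users
MONTHLY_REVENUE_THRESHOLD_USD = 10_000_000  # USD per month


def attribution_clause_applies(monthly_active_users: int,
                               monthly_revenue_usd: float) -> bool:
    """True if either threshold is exceeded, per the summary in this listing."""
    return (
        monthly_active_users > MAU_THRESHOLD
        or monthly_revenue_usd > MONTHLY_REVENUE_THRESHOLD_USD
    )
```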

    Pricing

    Open Source

    • Full model weights on Hugging Face
    • Local inference via Python
    • Voice cloning support
    • Audio watermarking
    • Commercial use allowed with attribution for large-scale deployments

    Capabilities

    Key Features

    • 8B-parameter text-to-speech model
    • High-quality conversational speech generation
    • Voice cloning via prompted generation
    • RVQ Transformer architecture
    • Llama 8B backbone with Llama 300M audio decoder
    • Mimi audio tokenizer
    • 32 audio codebooks
    • SilentCipher audio watermarking
    • Hugging Face model hosting
    • Local inference support
    • Python API
    • English language support

    Integrations

    Hugging Face Hub
    PyTorch
    torchaudio
    SilentCipher (Sony)
    uv

    Developer

    Miso Labs

    Miso Labs (operating as Kamino Learning, Inc.) builds state-of-the-art speech AI models, with Miso TTS 8B as their flagship open-source text-to-speech system. The team publishes model weights publicly on Hugging Face and maintains an open-source inference codebase on GitHub. Miso Labs provides a live demo on their website and actively develops conversational speech generation technology.


    Similar Tools

    Supertonic

    Lightning-fast, on-device text-to-speech system powered by ONNX Runtime that runs entirely locally with no cloud dependency, supporting 31 languages across Python, JavaScript, mobile, and native runtimes.

    Dia

    Dia is an open-source text-to-speech model by Nari Labs that generates realistic dialogue audio with multiple speakers, emotions, and non-verbal sounds from transcripts.

    VibeVoice

    An open-source family of frontier voice AI models from Microsoft, including long-form TTS, multi-speaker speech synthesis, real-time streaming TTS, and long-form ASR with speaker diarization.

    Related Topics

    Voice Synthesis

    AI tools that generate human-like speech from text.

    26 tools

    Generative Media

    AI platforms providing comprehensive generative capabilities across multiple media types including images, video, audio, and 3D content.

    93 tools

    Local Inference

    Tools and platforms for running AI inference locally without cloud dependence.

    117 tools