Sotto
Sotto is a native macOS voice-to-text app that uses local AI (Whisper) to transcribe speech and instantly paste text into any app, with no subscription required.
About Sotto
Sotto is a native macOS voice-to-text dictation tool built by indie developer Kitze. It uses a global hotkey push-to-talk model to capture speech and instantly insert transcribed text into any active application — no clipboard, no copy-paste. The app is sold as a one-time purchase and runs entirely on-device by default, with optional cloud model support.
What It Is
Sotto is a system-wide dictation layer for macOS. Users hold or tap a customizable hotkey to start recording and speak naturally; the transcribed text is then auto-pasted directly into whatever app is in focus, whether that is Xcode, Slack, Notes, a browser text field, or anything else. The core transcription engine is WhisperKit, running locally on Apple's Neural Engine, so audio never leaves the device unless the user opts into a cloud provider.
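The difference between holding and tapping the hotkey can be pictured as a small state machine. The sketch below is only an illustration of that idea: `Mode`, `HotkeyRecorder`, and `finished_sessions` are hypothetical names, and nothing here reflects Sotto's actual implementation.

```python
from enum import Enum


class Mode(Enum):
    PUSH_TO_TALK = "push_to_talk"  # hold to record, release to insert
    TOGGLE = "toggle"              # tap once to start, tap again to stop


class HotkeyRecorder:
    """Hypothetical model of the hotkey behaviour; not Sotto's code."""

    def __init__(self, mode: Mode) -> None:
        self.mode = mode
        self.recording = False
        # Number of recordings that would be handed off for transcription.
        self.finished_sessions = 0

    def key_down(self) -> None:
        if self.mode is Mode.PUSH_TO_TALK:
            # Holding the key keeps the recording open.
            self.recording = True
        elif self.recording:
            # Toggle mode: a second tap ends the recording.
            self._stop()
        else:
            # Toggle mode: the first tap starts the recording.
            self.recording = True

    def key_up(self) -> None:
        # Only push-to-talk reacts to the key being released.
        if self.mode is Mode.PUSH_TO_TALK and self.recording:
            self._stop()

    def _stop(self) -> None:
        self.recording = False
        self.finished_sessions += 1
```

In push-to-talk mode a single press-and-release produces one finished session, while in toggle mode it takes two taps.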
Local-First Architecture
The default setup runs entirely offline using WhisperKit models ranging from Tiny (~66 MB) to Large V3 Turbo (~954 MB) and Distil Large V3 (~800 MB). Sotto also supports NVIDIA Parakeet models (v2 English at 2.6 GB and v3 Multilingual at 2.7 GB) for state-of-the-art accuracy. For users who have strong accents or want a cloud option, optional backends include OpenAI (gpt-4o-mini-transcribe) and Groq (Whisper Large V3 Turbo). Cloud usage is billed at approximately $0.006/minute through the user's own API key.
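To put the quoted cloud rate in perspective, here is a rough cost estimate. It assumes a flat rate of about $0.006 per minute, as stated above; real provider pricing may differ by model and may change, and the function name is purely illustrative.

```python
# Approximate rate quoted above; an assumption, not a provider price list.
RATE_PER_MINUTE_USD = 0.006


def estimate_cloud_cost(minutes: float, rate: float = RATE_PER_MINUTE_USD) -> float:
    """Return the approximate cost in USD of transcribing `minutes` of audio."""
    if minutes < 0:
        raise ValueError("minutes must be non-negative")
    return round(minutes * rate, 4)
```

At that rate, an hour of dictation (60 minutes × $0.006) comes to roughly $0.36.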
Key Capabilities
- Push-to-talk or toggle mode — hold to speak and release to insert, or tap once to start/stop
- Auto-paste — transcripts appear directly in the active app without manual copying
- Recording history — every session saved locally with timestamps, full-text search, and audio playback
- Re-transcribe — switch models after the fact to improve accuracy on any saved recording
- Audio file import — drag and drop voice memos, meeting recordings, or podcast episodes for transcription
- Custom vocabulary — add technical terms, names, and jargon as hints for better accuracy
- Always-on rules — automatic grammar fixing, filler word removal, smart punctuation, and professional tone rewriting applied to every transcription
- AI Functions — one-click post-processing via OpenAI, Anthropic, Google Gemini, Groq, Mistral, or Grok to transform dictation into emails, coding prompts, summaries, translations, or custom outputs
- 90+ language support — per-language hotkeys and auto-detect mode for multilingual use
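As an illustration of what an always-on cleanup rule such as filler word removal might do, here is a minimal regex-based sketch. The filler list and the function name are hypothetical; Sotto's actual rules may well be AI-driven and are not described on this page.

```python
import re

# Hypothetical filler list for illustration only.
FILLERS = ("um", "uh", "erm", "you know")

# Match a whole filler word or phrase, plus an optional trailing comma and spaces.
_FILLER_RE = re.compile(
    r"\b(?:" + "|".join(re.escape(f) for f in FILLERS) + r")\b,?\s*",
    flags=re.IGNORECASE,
)


def remove_fillers(text: str) -> str:
    """Strip simple filler words from a transcript and tidy the spacing."""
    cleaned = _FILLER_RE.sub("", text)
    # Collapse any doubled spaces left behind by the removal.
    return re.sub(r"\s{2,}", " ", cleaned).strip()
```

The word-boundary anchors keep the rule from touching longer words that merely contain a filler, such as "umbrella".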
Platform and System Requirements
Sotto is a native Swift and SwiftUI app built exclusively for macOS 13 (Ventura) and later. Local inference runs on Apple's Neural Engine, which keeps transcription fast even with the larger models. A license key is delivered by email and covers up to 3 Mac activations.
Why It Stands Out
The homepage explicitly positions Sotto against subscription-based Whisper wrappers, arguing that paying a recurring fee for a model that is freely available is unnecessary. The one-time purchase model with lifetime updates and 3-device coverage is the central differentiator the product page emphasizes. The developer, Kitze, is also known for Sizzy (a developer browser), Benji, and other indie macOS and web tools.
Pricing
Sotto
One-time purchase for lifetime access on up to 3 Macs with all future updates included.
- Lifetime license — pay once, own forever
- Use on up to 3 Macs
- All future updates included
- Local Whisper models (Tiny to Medium)
- Cloud transcription (OpenAI, Groq)
- Import & transcribe any audio file
- Recording history with re-transcribe
- Custom vocabulary dictionary
- Auto-paste & auto-copy
Capabilities
Key Features
- Push-to-talk and toggle dictation modes
- Global hotkey system-wide operation
- Auto-paste into any active app
- 100% local transcription via WhisperKit on Apple Neural Engine
- NVIDIA Parakeet model support
- Optional cloud transcription via OpenAI and Groq
- Recording history with timestamps and full-text search
- Re-transcribe recordings with a different model
- Audio file import (mp3, m4a, wav, webm)
- Custom vocabulary dictionary
- Always-on text cleanup rules (grammar, filler words, punctuation, tone)
- AI Functions for post-processing (email, coding prompts, summaries, translation)
- Support for OpenAI, Anthropic, Google Gemini, Groq, Mistral, Grok
- 90+ language support
- Per-language hotkeys
- Auto-detect language mode
- Native Swift and SwiftUI app
- Lifetime license with 3 Mac activations
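For anyone batching files for import, a quick pre-check against the formats listed above (mp3, m4a, wav, webm) might look like the sketch below. The helper name is hypothetical, and the check only looks at the file extension, not the actual audio encoding.

```python
from pathlib import Path

# Import formats named in the feature list above.
SUPPORTED_EXTENSIONS = {".mp3", ".m4a", ".wav", ".webm"}


def is_importable(path: str) -> bool:
    """Return True if the file extension is one of the listed import formats."""
    return Path(path).suffix.lower() in SUPPORTED_EXTENSIONS
```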
