Sutando
An open-source, self-hosted AI agent for macOS that uses voice, vision, and autonomous action to control your computer, join meetings, make phone calls, and build itself.
At a Glance
Fully free and open-source under the MIT License. Requires a Claude Code subscription and a Gemini API key (free tier available).
Listed May 2026
About Sutando
Sutando is an open-source, self-hosted AI personal agent for macOS that combines voice control, screen vision, and autonomous task execution into a single local system. It runs on your existing Claude Code subscription and a free Gemini API key, with no remote control plane or third-party write access. Sutando can see your screen, join your meetings, make phone calls, send messages, and autonomously improve its own capabilities when idle — all from your Mac.
- Voice control — Connect via browser or phone; say commands like "what's on my screen?" or "fix the typo in that file" and Sutando acts immediately using Gemini Live real-time voice.
- Screen vision — Sutando captures and analyzes your screen on demand, enabling context-aware assistance without manual copy-paste.
- Meeting participation — Say "join my 2pm call" and Sutando reads your calendar, joins Zoom or Google Meet with computer audio, researches questions live, and writes a summary when done.
- Phone calls — Sutando can make and receive calls via Twilio, have conversations on your behalf, and report back while you keep working.
- Multi-channel messaging — Reach the same agent via voice, Telegram, Discord, web, phone, or email — all sharing the same memory and task queue.
- Autonomous build loop — When idle, Sutando monitors its own health, detects usage patterns, discovers new skills, and writes missing capabilities — most of its own code was built this way.
- Notes and memory — Capture ideas by voice; Sutando tags, saves, and searches them as YAML-frontmatter markdown notes and acts on actionable items automatically.
- Multi-machine scaling — Plug in a second Mac and Sutando migrates services autonomously via Discord, coordinating the handoff between agents without migration scripts.
- 3-tier access control — Owner, verified, and unverified callers get different capability bands on phone, Discord, and Telegram, with STIR/SHAKEN caller ID verification for inbound calls.
- Quick start — Clone the repo, add your `GEMINI_API_KEY` to `.env`, and run `bash src/startup.sh`; a menu bar app, dashboard, and voice interface launch automatically.
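The notes feature above stores captures as markdown files with YAML frontmatter. As a rough illustration of that file shape, the sketch below renders and parses such a note using only the Python standard library. The field names (`created`, `tags`) and the helper functions are assumptions made for this example, not Sutando's actual schema or code, and the parser handles only the flat subset of YAML that the renderer emits.

```python
from datetime import date


def render_note(body, tags, created):
    """Render a note as markdown with a YAML frontmatter header.

    The field names are hypothetical; Sutando's real schema may differ.
    """
    lines = [
        "---",
        f"created: {created.isoformat()}",
        "tags: [" + ", ".join(tags) + "]",
        "---",
        "",
        body.strip(),
        "",
    ]
    return "\n".join(lines)


def parse_note(text):
    """Split a note into (metadata, body).

    Supports only `key: value` and `key: [a, b]` lines, not full YAML.
    """
    parts = text.split("---\n", 2)
    if not text.startswith("---\n") or len(parts) < 3:
        return {}, text  # no frontmatter found; treat everything as body
    _, header, body = parts
    meta = {}
    for line in header.splitlines():
        key, _, value = line.partition(":")
        value = value.strip()
        if value.startswith("[") and value.endswith("]"):
            meta[key.strip()] = [v.strip() for v in value[1:-1].split(",") if v.strip()]
        else:
            meta[key.strip()] = value
    return meta, body.strip()


def notes_with_tag(notes, tag):
    """Return the bodies of all notes whose frontmatter lists `tag`."""
    matches = []
    for text in notes:
        meta, body = parse_note(text)
        if tag in meta.get("tags", []):
            matches.append(body)
    return matches
```

For example, `render_note("Call the dentist tomorrow", ["todo", "health"], date(2026, 5, 1))` produces a file that `parse_note` splits back into its metadata and body. A real implementation would more likely use a full YAML parser.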
Pricing
Open Source (MIT)
- Voice control via browser
- Screen capture and vision
- Autonomous task execution
- Notes and memory
- Multi-channel messaging (Telegram, Discord)
Capabilities
Key Features
- Voice control via browser or phone
- Screen capture and vision analysis
- Autonomous meeting joining (Zoom, Google Meet)
- Outbound and inbound phone calls via Twilio
- Multi-channel messaging (Telegram, Discord, email, web)
- Autonomous build loop and self-improvement
- Voice-driven note capture and search
- Multi-machine agent scaling and migration
- 3-tier access control (owner/verified/unverified)
- Global keyboard shortcuts via macOS menu bar app
- Proactive health monitoring and auto-repair
- Pattern detection and user modeling
- Gmail read/send/search
- Calendar and reminders integration
- Browser automation via MCP tools
- Cross-node memory and notes sync
- Info-radar (arXiv, GitHub, HN, news monitoring)
- System dashboard at localhost:7844
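The 3-tier access control listed above (owner, verified, unverified) can be pictured as an ordered set of tiers, each unlocking a wider band of capabilities. The sketch below is a minimal illustration under stated assumptions: the capability names, the `REQUIRED_TIER` table, and the `attested` flag (standing in for a caller-ID check such as STIR/SHAKEN) are all hypothetical and do not come from Sutando's codebase.

```python
from enum import IntEnum


class Tier(IntEnum):
    """Ordered access tiers; a higher value means more trust."""
    UNVERIFIED = 0
    VERIFIED = 1
    OWNER = 2


# Hypothetical mapping from capability to the minimum tier that may use it.
REQUIRED_TIER = {
    "leave_message": Tier.UNVERIFIED,
    "ask_availability": Tier.VERIFIED,
    "run_task": Tier.OWNER,
}


def classify_caller(caller_id, owner_ids, verified_ids, attested=True):
    """Map a caller to a tier.

    `attested` is an assumed stand-in for a caller-ID verification result;
    callers that fail it are treated as unverified regardless of their ID.
    """
    if not attested:
        return Tier.UNVERIFIED
    if caller_id in owner_ids:
        return Tier.OWNER
    if caller_id in verified_ids:
        return Tier.VERIFIED
    return Tier.UNVERIFIED


def is_allowed(tier, capability):
    """Allow a capability only if the tier meets its minimum requirement."""
    required = REQUIRED_TIER.get(capability)
    # Unknown capabilities are denied by default.
    return required is not None and tier >= required
```

Denying unknown capabilities by default keeps a newly added skill from being exposed to every caller before someone assigns it a tier.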