    EveryDev.ai
    Agent Desktop

    Browser Automation

    A native desktop automation CLI, built in Rust, that lets AI agents control any application through OS accessibility trees, with structured JSON output and deterministic element refs.

    At a Glance

    Pricing
    Open Source

    Fully free and open-source under the Apache License 2.0. Use, modify, and distribute freely.

    Available On

    Windows
    macOS
    Linux
    API
    VS Code

    Resources

    Website, Docs, GitHub, llms.txt

    Topics

    Browser Automation, Autonomous Systems, Agent Frameworks

    Alternatives

    Page Agent, Browser Use, Cua
    Developer
    lahfir
    lahfir builds native desktop automation tooling for AI agent…

    Listed May 2026

    About Agent Desktop

    agent-desktop is a native desktop automation CLI built in Rust, designed for AI agents to observe, decide, and act on any desktop application. It reads and drives applications through OS accessibility trees, so it needs no screenshots, pixel matching, or browser. It outputs machine-readable JSON with deterministic element references, which suits agentic workflows that need reliable, repeatable UI interactions.
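
To make the idea of deterministic element references concrete, here is a minimal Python sketch. It does not call agent-desktop: the tree schema, the role names, and the rule "number interactive elements in depth-first order" are all assumptions made for illustration, and the real tool's JSON format may differ. The point is only that a fixed traversal order over the same tree yields the same refs every time.

```python
import json

# Hypothetical accessibility tree; the real tool's schema may differ.
TREE = {
    "role": "window", "name": "Settings", "children": [
        {"role": "button", "name": "Back", "children": []},
        {"role": "group", "name": "Network", "children": [
            {"role": "checkbox", "name": "Wi-Fi", "children": []},
            {"role": "textfield", "name": "DNS", "children": []},
        ]},
    ],
}

# Assumed set of roles an agent can act on.
INTERACTIVE_ROLES = {"button", "checkbox", "textfield"}


def assign_refs(tree):
    """Walk the tree depth-first and number interactive elements @e1, @e2, ..."""
    refs = {}

    def walk(node):
        if node["role"] in INTERACTIVE_ROLES:
            refs[f"@e{len(refs) + 1}"] = {"role": node["role"], "name": node["name"]}
        for child in node["children"]:
            walk(child)

    walk(tree)
    return refs


refs = assign_refs(TREE)
print(json.dumps(refs, indent=2))
```

Because the numbering depends only on tree shape and traversal order, an agent can refer to `@e2` in a follow-up action without re-describing the element, as long as no new snapshot has been taken in between.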

    • Native Rust CLI: Fast, single binary with no runtime dependencies — install via npm install -g agent-desktop or build from source with Cargo.
    • 53 commands: Covers observation, interaction, keyboard, mouse, notifications, clipboard, and window management for comprehensive desktop control.
    • Progressive skeleton traversal: Achieves 78–96% token reduction on dense apps via shallow overview and targeted drill-down, minimizing LLM context usage.
    • Snapshot & refs system: AI-optimized workflow using deterministic element references (@e1, @e2) that persist until the next snapshot, enabling reliable act-verify loops.
    • AX-first interactions: Every action exhausts pure accessibility API strategies before falling back to mouse events, maximizing reliability.
    • Structured JSON output: All commands return machine-readable responses with error codes and recovery hints for robust agent error handling.
    • C-ABI cdylib (FFI): Load libagent_desktop_ffi once from Python, Swift, Go, Ruby, Node, or C instead of forking the CLI per call — prebuilt binaries ship with every release.
    • Works with any app: Finder, Safari, System Settings, Xcode, Slack — anything with an OS accessibility tree is supported.
    • Batch command execution: Run multiple commands in a single call with --stop-on-error support for efficient multi-step agent workflows.
    • Cross-platform FFI binaries: Prebuilt cdylib artifacts available for macOS arm64/x86_64, Linux x86_64/arm64, and Windows x86_64.
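
The "shallow overview and targeted drill-down" idea behind progressive skeleton traversal can be sketched in Python as follows. This is an illustration on a synthetic tree, not agent-desktop's implementation: `make_tree`, `skeleton`, `drill_down`, and the character-count proxy for tokens are all hypothetical, and the reduction printed by this toy example should not be read as reproducing the 78 to 96% figure quoted above.

```python
import json


def make_tree(depth, fanout, name="root"):
    """Build a synthetic UI tree that stands in for a dense app's accessibility tree."""
    children = [] if depth == 0 else [
        make_tree(depth - 1, fanout, f"{name}.{i}") for i in range(fanout)
    ]
    return {"name": name, "children": children}


def skeleton(node, max_depth):
    """Copy the tree down to max_depth; deeper subtrees collapse to a child count."""
    if max_depth == 0:
        return {"name": node["name"], "child_count": len(node["children"])}
    return {
        "name": node["name"],
        "children": [skeleton(c, max_depth - 1) for c in node["children"]],
    }


def drill_down(node, path, max_depth):
    """Follow the child indices in path, then return a skeleton of that subtree."""
    for index in path:
        node = node["children"][index]
    return skeleton(node, max_depth)


def size(obj):
    """Serialized length in characters, used here as a rough proxy for LLM tokens."""
    return len(json.dumps(obj, separators=(",", ":")))


full = make_tree(depth=5, fanout=4)
overview = skeleton(full, max_depth=1)            # shallow overview of the whole app
detail = drill_down(full, path=[2], max_depth=1)  # expand only the branch of interest
sent = size(overview) + size(detail)
reduction = 1 - sent / size(full)
print(f"full={size(full)} chars, overview+detail={sent} chars, reduction={reduction:.0%}")
```

The agent pays for the overview once, then only for the branches it chooses to expand, which is where the context savings come from.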
    Pricing

    Open Source

    Fully free and open-source under the Apache License 2.0. Use, modify, and distribute freely.

    • All 53 commands
    • Native Rust CLI binary
    • C-ABI cdylib FFI library
    • Progressive skeleton traversal
    • Structured JSON output

    Capabilities

    Key Features

    • Native Rust CLI — single binary, no runtime dependencies
    • 53 commands covering observation, interaction, keyboard, mouse, notifications, clipboard, and window management
    • Progressive skeleton traversal with 78–96% token reduction
    • Deterministic element refs (@e1, @e2) via snapshot system
    • AX-first interaction strategy before mouse fallback
    • Structured JSON output with error codes and recovery hints
    • C-ABI cdylib (libagent_desktop_ffi) for in-process FFI from Python, Swift, Go, Ruby, Node, C
    • Batch command execution with stop-on-error support
    • Accessibility tree traversal — no screenshots or pixel matching
    • App and window management (launch, close, resize, move, minimize, maximize)
    • Clipboard read/write/clear
    • Notification listing and dismissal (macOS)
    • Wait commands with element, window, text, and menu conditions
    • Prebuilt binaries for macOS arm64/x86_64, Linux x86_64/arm64, Windows x86_64
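
As a rough illustration of how an agent might consume structured responses and stop-on-error batches, consider the Python sketch below. Nothing here invokes the real CLI: `fake_run` simulates responses, and the response schema (`ok`, `error.code`, `error.hint`), the `STALE_REF` code, and the command dictionaries are hypothetical stand-ins rather than agent-desktop's documented format.

```python
import json


def fake_run(command, live_refs):
    """Stand-in for invoking the CLI; returns a JSON string in a hypothetical schema."""
    ref = command["ref"]
    if ref not in live_refs:
        return json.dumps({
            "ok": False,
            "error": {"code": "STALE_REF", "hint": "take a new snapshot and retry"},
        })
    return json.dumps({"ok": True, "data": {"action": command["action"], "ref": ref}})


def run_batch(commands, live_refs, stop_on_error=True):
    """Run commands in order; with stop_on_error, halt after the first failure."""
    results = []
    for command in commands:
        response = json.loads(fake_run(command, live_refs))
        results.append(response)
        if not response["ok"] and stop_on_error:
            break
    return results


commands = [
    {"action": "click", "ref": "@e1"},
    {"action": "click", "ref": "@e9"},  # not in the current snapshot
    {"action": "type", "ref": "@e2"},
]
results = run_batch(commands, live_refs={"@e1", "@e2"})
for response in results:
    if response["ok"]:
        print("ok:", response["data"])
    else:
        print("failed:", response["error"]["code"], "->", response["error"]["hint"])
```

Stopping at the first failure keeps later steps from acting on a UI state the agent did not expect, and the recovery hint gives the agent a concrete next move instead of a free-form error string.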

    Integrations

    Python (via ctypes/FFI)
    Swift
    Go
    Ruby
    Node.js
    C/C++
    npm
    Cargo
    MCP (Model Context Protocol)
    Slack
    Safari
    Finder
    Xcode
    VS Code
    Notion
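
For the Python (via ctypes/FFI) route, a minimal loading sketch might look like this. The filenames follow the usual Rust cdylib naming conventions for a crate called `agent_desktop_ffi`, which is an assumption: the release artifacts may be named or packaged differently. The listing does not enumerate the library's exported functions, so the sketch stops at loading and calls nothing.

```python
import ctypes
import platform
from pathlib import Path

# Assumed crate name, derived from the libagent_desktop_ffi name in the listing.
LIB_STEM = "agent_desktop_ffi"


def ffi_library_filename(system=None):
    """Return the conventional shared-library filename for the given OS."""
    system = system or platform.system()
    if system == "Darwin":
        return f"lib{LIB_STEM}.dylib"
    if system == "Windows":
        return f"{LIB_STEM}.dll"
    return f"lib{LIB_STEM}.so"


def load_ffi(directory):
    """Load the library from directory once, or return None if it is not there."""
    path = Path(directory) / ffi_library_filename()
    if not path.is_file():
        return None
    return ctypes.CDLL(str(path))


lib = load_ffi(".")
print(ffi_library_filename(), "loaded" if lib else "not found in current directory")
```

Loading the library once and reusing the handle is what avoids the per-call process fork that the CLI route incurs.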

    Developer

    lahfir

    lahfir builds native desktop automation tooling for AI agents, with agent-desktop as the flagship project. The project is written in Rust and focuses on accessibility-tree-based control of desktop applications without screenshots or pixel matching. It ships prebuilt binaries and C-ABI FFI libraries for broad language compatibility.

    Similar Tools

    Page Agent

    Page Agent is an open-source browser automation framework by Alibaba that enables AI agents to interact with web pages using natural language instructions.

    Browser Use

    Browser Use is an AI-powered browser automation platform that lets agents extract data, automate tasks, and interact with any website at scale using natural language.

    Cua

    Cua is a computer use agent platform that lets you build AI agents capable of seeing screens, clicking buttons, typing, and running code across macOS, Windows, and Linux sandboxes.

    Related Topics

    Browser Automation

    AI-powered agents that autonomously navigate and interact with web applications to automate repetitive tasks, extract data, fill forms, and perform web-based workflows using intelligent understanding of page structure and content.

    58 tools

    Autonomous Systems

    AI agents that can perform complex tasks with minimal human guidance.

    176 tools

    Agent Frameworks

    Tools and platforms for building and deploying custom AI agents.

    246 tools