UFO³ is an open-source multi-device GUI agent framework by Microsoft that orchestrates intelligent agents across Windows, Linux, and Android using DAG-based task planning.
At a Glance
Fully free and open-source under the MIT License. Self-hosted via GitHub.
Listed Jun 2026
About UFO
UFO³ is a Microsoft Research open-source project that has evolved from a single Windows GUI agent into a full multi-device orchestration framework. Released under the MIT License, it coordinates intelligent agents across heterogeneous platforms (Windows, Linux, and Android) using a declarative DAG-based task model. The project has accumulated over 9,000 GitHub stars since its initial release in February 2024.
What It Is
UFO³ is a GUI automation agent framework that lets users describe complex tasks in natural language and have them executed automatically across one or more devices. It ships as two tightly integrated components: UFO², a stable Desktop AgentOS for single-device Windows automation, and Galaxy, a newer multi-device orchestration layer that decomposes requests into executable directed acyclic graphs (DAGs) and dispatches subtasks to capable device agents in parallel. Both components are written in Python and require an API key for a supported LLM provider (OpenAI, Azure OpenAI, Qwen, Gemini, Claude, and others).
Architecture: UFO² and Galaxy
The project's README describes a two-tier architecture:
- UFO² (Desktop AgentOS): the stable, long-term-support layer. It integrates deeply with Windows via UIA, Win32, and WinCOM APIs, supports hybrid GUI-click plus API-call actions, and uses speculative multi-action batching that the documentation claims reduces LLM calls by 51%. UFO² can run standalone or serve as a Galaxy device agent for Windows.
- Galaxy (Multi-Device Orchestration): the newer active-development layer. A ConstellationAgent decomposes user requests into a TaskConstellation DAG of TaskStar nodes with dependencies. A TaskOrchestrator schedules and executes nodes asynchronously, matching tasks to devices by capability. Agents communicate over a WebSocket-based Unified Agent Interaction Protocol (AIP) with fault tolerance and automatic reconnection.
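The scheduling idea described for Galaxy can be illustrated with a small asyncio sketch. This is not UFO³ code: the Star class, the run_constellation function, and the device names are hypothetical stand-ins chosen for illustration, and the real TaskStar and TaskOrchestrator interfaces may look quite different. The sketch only shows the general pattern of starting each node once its dependencies have finished and assigning it to a device that advertises the required capability. It assumes the input graph is acyclic.

```python
import asyncio
from dataclasses import dataclass, field


@dataclass
class Star:
    """Hypothetical stand-in for a task node (not the UFO³ TaskStar class)."""
    name: str
    needs: str                                  # capability the task requires
    deps: list = field(default_factory=list)    # names of upstream tasks


async def run_constellation(stars, devices):
    """Run each node after its dependencies finish; independent nodes overlap.

    `devices` maps a device id to the set of capabilities it advertises.
    Returns (task name, device id) pairs in completion order.
    """
    # Fail early if any task has no capable device.
    for star in stars:
        if not any(star.needs in caps for caps in devices.values()):
            raise LookupError(f"no device offers {star.needs!r}")

    done = {s.name: asyncio.Event() for s in stars}
    order = []

    async def run(star):
        # Wait for every upstream node before starting.
        for dep in star.deps:
            await done[dep].wait()
        # Pick the first device that advertises the needed capability.
        device = next(d for d, caps in devices.items() if star.needs in caps)
        await asyncio.sleep(0)  # placeholder for dispatching work to the device
        order.append((star.name, device))
        done[star.name].set()

    await asyncio.gather(*(run(s) for s in stars))
    return order
```

Calling asyncio.run(run_constellation(stars, devices)) with two independent tasks and a third that depends on both lets the first two proceed concurrently, while the dependent task runs last.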
Key Capabilities
- Declarative DAG decomposition — requests become structured graphs with explicit dependencies, enabling automated scheduling and runtime rewriting
- Dynamic graph evolution — the constellation adapts to execution feedback through controlled rewrites rather than rigid pre-planned sequences
- Heterogeneous async orchestration — capability-based device matching with safe locking and formally verified concurrency correctness
- MCP integration — Model Context Protocol support for tool augmentation in device agents
- RAG knowledge substrate — retrieval-augmented generation over documentation, demos, and execution traces for UFO²
- Visual + UIA hybrid detection — combines screenshot-based and accessibility-tree-based control detection for robustness
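As a rough illustration of what a controlled rewrite of a task graph could look like, the sketch below replaces a failed node with a substitute and rejects the change if it would break the DAG property. The apply_rewrite function, its parameters, and the graph representation (a dict mapping each task name to the set of names it depends on) are assumptions made for this example, not part of the UFO³ API. Cycle detection uses graphlib.TopologicalSorter from the Python standard library (Python 3.9+).

```python
from graphlib import CycleError, TopologicalSorter


def apply_rewrite(graph, failed, replacement, extra_deps=()):
    """Return a new graph in which `failed` is replaced by `replacement`.

    `graph` maps each task name to the set of task names it depends on.
    The original graph is left untouched. Raises ValueError if the
    rewritten graph would contain a cycle.
    """
    new_graph = {}
    for task, deps in graph.items():
        if task == failed:
            continue
        # Re-point downstream tasks at the replacement node.
        new_graph[task] = {replacement if d == failed else d for d in deps}
    # The replacement inherits the failed node's dependencies, plus any extras.
    new_graph[replacement] = set(graph[failed]) | set(extra_deps)
    try:
        # static_order() raises CycleError when the graph is not a DAG.
        list(TopologicalSorter(new_graph).static_order())
    except CycleError as exc:
        raise ValueError("rewrite rejected: it would introduce a cycle") from exc
    return new_graph
```

Validating every rewrite before it is accepted is one simple way to keep a graph schedulable while it changes at runtime; the framework's own rewrite rules may be stricter or structured differently.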
Setup Path
Both frameworks are installed from a clone of the GitHub repository by running pip install -r requirements.txt. Configuration requires editing YAML files to supply LLM API keys and, for Galaxy, registering device agents in a devices.yaml pool. The README provides separate quick-start paths for Galaxy (cross-device) and UFO² (Windows-only), with platform-specific guides for Windows, Linux, and Android device agents.
Update: UFO³ Galaxy and Version 3.0.7
The project's evolution timeline spans three generations: the original UFO GUI agent (February 2024), UFO² Desktop AgentOS (April 2025, now in LTS), and UFO³ Galaxy (November 2025). The latest GitHub release is version 3.0.7, published June 12, 2026. UFO² has entered Long-Term Support status with ongoing bug fixes and security updates. Galaxy is marked as active development, recommended for experimentation and non-critical workflows. Two research papers accompany the releases: arXiv:2504.14603 for UFO² and arXiv:2511.11332 for UFO³ Galaxy.
Why It Got Attention
The original UFO release in February 2024 received wide media coverage for applying multimodal LLMs directly to Windows GUI automation. UFO² extended this into an "AgentOS" concept with deeper OS integration. UFO³ Galaxy represents a research-level step toward coordinating fleets of heterogeneous device agents; the README describes it as the first multi-device orchestration framework for GUI agents. This places it within the broader multi-agent systems research landscape, alongside related Microsoft projects like TaskWeaver.
Pricing
Open Source
- UFO² Desktop AgentOS for Windows
- Galaxy multi-device orchestration framework
- Full source code access
- MIT License — free to use, modify, and distribute
- Community support via GitHub Discussions and Issues
Capabilities
Key Features
- Multi-device DAG-based task orchestration
- UFO² Desktop AgentOS for Windows automation
- Galaxy multi-device orchestration framework
- Declarative task decomposition into TaskConstellation DAGs
- Dynamic graph evolution with runtime rewriting
- Asynchronous parallel task execution
- Capability-based device matching and assignment
- Unified Agent Interaction Protocol (AIP) over WebSocket
- Model Context Protocol (MCP) integration
- Hybrid GUI + API action execution
- Speculative multi-action batching
- Visual + UIA hybrid control detection
- RAG knowledge substrate with docs and execution traces
- Support for OpenAI, Azure OpenAI, Qwen, Gemini, Claude
- Windows UIA, Win32, WinCOM native integration
- Android and Linux device agent support
- Real-time status monitoring and visualization
- Fault tolerance and automatic reconnection

