# Agent TARS CLI

> Command-line interface for running the Agent TARS multimodal agent locally, with optional Web UI, model providers, and a typed workspace config.

Agent TARS CLI is the terminal interface for the Agent TARS stack — a multimodal agent that can browse, run commands, use MCP tools, and coordinate GUI/browser actions. It supports multiple model providers (Volcano Engine Seed-1.5-VL, Anthropic Claude 3.7 Sonnet, OpenAI GPT-4o), can pop open a local Web UI to inspect runs, and lets you manage a **typed** workspace config in TypeScript.

### Installation
- **Prerequisites**
  - **Node.js ≥ 22** (LTS recommended)
  - **Google Chrome** (the CLI controls your *local* browser)
- **Install**
  ```bash
  # latest
  npm install -g @agent-tars/cli@latest
  
  # or install the current beta
  npm install -g @agent-tars/cli@next
  ```

### Usage
1) **Pick a model provider & run**
```bash
# Volcano Engine (Seed 1.5 VL / Doubao)
agent-tars \
  --provider volcengine \
  --model doubao-1-5-thinking-vision-pro-250428 \
  --apiKey "$VOLC_API_KEY"

# Anthropic
agent-tars \
  --provider anthropic \
  --model claude-3-7-sonnet-latest \
  --apiKey "$ANTHROPIC_API_KEY"

# OpenAI
agent-tars \
  --provider openai \
  --model gpt-4o \
  --apiKey "$OPENAI_API_KEY"
```
When the CLI starts it prints a local link (e.g. `http://localhost:8888`) — open it to view the **Web UI** for the current run.

2) **Create a global workspace** (recommended)
```bash
agent-tars workspace --init   # guided setup
agent-tars workspace --open   # open the folder
```
Then manage config in TypeScript with types:
```ts
// agent-tars.config.ts
import { defineConfig } from '@agent-tars/interface';
export default defineConfig({
  model: { provider: 'volcengine' },
  // ...other settings
});
```

3) **Start a task**
```bash
agent-tars --provider openai --model gpt-4o --apiKey "$OPENAI_API_KEY"
# Then in the Web UI, enter a prompt like:
# "Tell me the top 10 for Humanity's Last Exam"
```

**Notes**
- Designed for headless CLI use; you can summon the Web UI on demand to inspect/steer runs.
- Integrates with browser, shell commands, file system, and MCP tools; includes built-in tools like Search/Browser/File/Command.

## Features
- Multimodal agent runs from the terminal with optional Web UI viewer
- Model provider plug-ins (Volcano Engine Seed-1.5-VL, Anthropic, OpenAI)
- Local browser automation via Chrome
- Typed workspace configuration in TypeScript
- Built-in tools (Search, Browser, File, Command) plus MCP integration
- Headless operation with on-demand UI
- Supports visual grounding and tool calls with compatible models

## Integrations
Google Chrome, Volcano Engine (Doubao / Seed-1.5-VL), Anthropic Claude, OpenAI GPT-4o, MCP tools

## Platforms
WINDOWS, MACOS, LINUX

## Pricing
Open Source

## Version
0.2.10

## Links
- Website: https://agent-tars.com/
- Documentation: https://agent-tars.com/guide/get-started/quick-start.html
- Repository: https://github.com/bytedance/UI-TARS-desktop
- EveryDev.ai: https://www.everydev.ai/tools/agent-tars-cli