Market Position

A pioneering realistic benchmark for web agents, focusing on functional correctness and high-authenticity environments rather than just text-based interactions.

Leadership

Founders

Shuyan Zhou

Assistant Professor at Duke University (since 2024); PhD from Carnegie Mellon University; previously a researcher at Google[x] and Microsoft Research. Lead contributor to WebArena.

Frank F. Xu

Researcher at Carnegie Mellon University focusing on code generation and autonomous agents. Lead developer and contributor to WebArena and TheAgentCompany.

Graham Neubig

Associate Professor at Carnegie Mellon University; Co-Founder of Inspired Cognition and All Hands AI. Principal investigator for the WebArena project.

Executive Team

Shuyan Zhou

Project Lead / Assistant Professor (Duke)

Specializes in NLP and autonomous agents.

Frank F. Xu

Lead Developer / Researcher (CMU)

Expert in machine learning and software engineering.

Board of Directors

Daniel Fried

Faculty Advisor (Carnegie Mellon University)

Yonatan Bisk

Faculty Advisor (Carnegie Mellon University)

Ruslan Salakhutdinov

Advisor (VisualWebArena)

Founding Story

WebArena was created to move beyond toy benchmarks and provide a realistic end-to-end environment where agents must interact with complex websites and tools, mimicking human problem-solving workflows.

Business Model

Revenue Model

Open-source research project; supported by academic research grants from CMU, Duke, and affiliated organizations.

Not applicable (Research Project)

Target Markets

Industries & Segments

AI Research Labs
Technology Companies developing AI Agents
Open Source AI Community

Use Cases

Benchmarking autonomous web agents
Training LLMs for web-based computer use
Research in multimodal AI perception and reasoning
Evaluating agent safety and reliability

Notable Customers

Anthropic
OpenAI
Meta
Microsoft

Quick Facts

Headquarters

Pittsburgh, PA

Founded

2023

Entity Type

Academic Research Collective / Open Source Organization

Employees

15

Total Funding

Not disclosed (primarily supported via academic research grants)

Investors

Carnegie Mellon University, Duke University

Office Locations

Carnegie Mellon University

Duke University

History & Milestones

May 2025

TheAgentCompany benchmark presented at ICML 2025.

December 2024

WebArena presented as an Oral paper at NeurIPS 2024.

November 2024

WebArena-Infinity announced for automated environment generation.

August 2024

VisualWebArena (multimodal benchmark) presented at ACL 2024.

July 2023

WebArena paper first released on arXiv, introducing the realistic web benchmark.

Key Capabilities

Self-hostable sandboxed web environments

Programmatic verification of functional correctness

Diverse task categories (E-commerce, Social, Productivity, Maps)

Multimodal (text + image) input support

Automated task and environment generation
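
The idea behind programmatic verification of functional correctness can be illustrated with a minimal sketch. It is not WebArena's actual evaluator: the SandboxState class, the check_functional_correctness function, and the shop.example URL are all hypothetical names chosen for illustration. The point it shows is that a task is scored on the final state of the environment, not on the exact sequence of actions the agent took.

```python
from dataclasses import dataclass, field


@dataclass
class SandboxState:
    """Hypothetical snapshot of a sandboxed site after an agent run."""
    url: str
    cart_items: list[str] = field(default_factory=list)


def check_functional_correctness(
    state: SandboxState,
    expected_url_prefix: str,
    required_items: set[str],
) -> bool:
    """Return True when the final state satisfies the task goal.

    Only the outcome is inspected; any action sequence that reaches
    this state counts as a success.
    """
    on_expected_page = state.url.startswith(expected_url_prefix)
    has_required_items = required_items.issubset(state.cart_items)
    return on_expected_page and has_required_items


# Illustrative final state for a task such as "add a USB-C cable to the cart".
final_state = SandboxState(
    url="http://shop.example/checkout/cart",
    cart_items=["usb-c cable", "phone case"],
)
passed = check_functional_correctness(
    final_state, "http://shop.example/checkout", {"usb-c cable"}
)
```

Because the check reads state rather than an action trace, two agents that take different routes through the site would receive the same verdict as long as they end in the same place.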

Integrations & Partnerships

Platform Integrations

Docker
GitHub
Hugging Face
arXiv

Key Partnerships

Carnegie Mellon University

Duke University

Connect

Website

webarena.dev

GitHub

web-arena-x

AI Topics

web-arena-x focuses on these topics:

Agent Harness (1)

Browser Automation (1)

LLM Evaluations (1)

web-arena-x

To build realistic, reproducible web environments for training and evaluating autonomous web agents that can handle complex, real-world tasks.

At a Glance

Tool Views

10

Headquarters

Pittsburgh, PA

Est.

2023

Employees

15

AI Tools by web-arena-x (1)

WebArena

Web Agent Benchmark Environment

Agent Harness, Browser Automation, LLM Evaluations

Latest News

05/01/2025

TheAgentCompany presented at ICML 2025

the-agent-company.com

12/01/2024

WebArena presented as Oral at NeurIPS 2024

webarena.dev

11/01/2024

WebArena-Infinity announced

webarena.dev

08/01/2024

VisualWebArena presented at ACL 2024

jykoh.com

Products & Services

WebArena

2023

A standalone, self-hostable web environment for building autonomous agents, with fully functional websites from four common categories: e-commerce, social forums, collaborative software development, and content management.

VisualWebArena

2024

A benchmark designed to assess the performance of multimodal web agents on realistic visual web tasks.

WebArena-Infinity

2024

A framework for automatically generating browser environments with verifiable tasks and high authenticity.

TheAgentCompany

2025

An extensible benchmark for evaluating AI agents on professional tasks within a simulated company environment.
