    BrowserGym

    Browser Automation

    An open-source Gym environment for web task automation, enabling researchers to build, test, and benchmark web agents across multiple standardized benchmarks.

    At a Glance

    Pricing
    Open Source

    Fully free and open-source under the Apache 2.0 license. Install via pip and use all benchmarks at no cost.

    Available On

    iOS
    API
    SDK
    CLI

    Resources

    Website, Docs, GitHub, llms.txt

    Topics

    Browser Automation, Agent Frameworks, LLM Evaluations

    Alternatives

    Firecrawl, TuriX CUA, Cua
    Developer
    ServiceNow, Santa Clara, CA, Est. 2004, $83.7M raised

    Listed Jun 2026

    About BrowserGym

    BrowserGym is an open-source framework developed by ServiceNow Research that provides a standardized Gym-compatible environment for web task automation and web agent research. Built on top of Playwright and the Gymnasium interface, it lets researchers implement agents that interact with real browsers and evaluate them across a growing suite of benchmarks. The project is published under the Apache License 2.0 and is explicitly positioned as a research tool rather than a consumer product.

    What It Is

    BrowserGym wraps browser interactions into the familiar gym.make / env.step loop from reinforcement learning, making it straightforward to plug in any LLM-based or rule-based agent. Each task exposes observations (DOM, screenshots, accessibility trees) and accepts actions (clicks, typing, navigation), with rewards computed by benchmark-specific evaluators. The framework is designed to be extensible: new benchmarks can be added by subclassing AbstractBrowserTask.
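
    The reset/step loop described above can be sketched without a browser. The block below is a toy illustration only: ToyBrowserEnv, rule_based_agent, and run_episode are hypothetical names invented for this example and are not part of BrowserGym. The only convention borrowed from Gymnasium is the shape of the return values: reset() yields (observation, info) and step() yields (observation, reward, terminated, truncated, info).

```python
"""Toy illustration of a Gymnasium-style agent loop; no real browser is used."""
from dataclasses import dataclass, field


@dataclass
class ToyBrowserEnv:
    """Hypothetical stand-in for a browser task: 'click the submit button'."""

    max_steps: int = 5
    _steps: int = field(default=0, init=False)

    def _observe(self) -> dict:
        # A real environment would expose DOM, screenshot, or accessibility data.
        return {"goal": "Click the submit button", "elements": ["cancel", "submit"]}

    def reset(self):
        self._steps = 0
        return self._observe(), {}  # (observation, info)

    def step(self, action: str):
        self._steps += 1
        solved = action == "click('submit')"
        reward = 1.0 if solved else 0.0  # stands in for a benchmark evaluator
        terminated = solved
        truncated = (not solved) and self._steps >= self.max_steps
        return self._observe(), reward, terminated, truncated, {}


def rule_based_agent(obs: dict) -> str:
    """Click the first element whose name appears in the goal text."""
    for name in obs["elements"]:
        if name in obs["goal"].lower():
            return f"click('{name}')"
    return f"click('{obs['elements'][0]}')"


def run_episode(env, agent):
    """Run one episode and return (total_reward, number_of_steps)."""
    obs, _info = env.reset()
    total_reward, steps, done = 0.0, 0, False
    while not done:
        action = agent(obs)
        obs, reward, terminated, truncated, _info = env.step(action)
        total_reward += reward
        steps += 1
        done = terminated or truncated
    return total_reward, steps


if __name__ == "__main__":
    print(run_episode(ToyBrowserEnv(), rule_based_agent))
```

    Because the agent is just a function from observation to action, an LLM-based policy could be swapped in for rule_based_agent without changing the loop, which is the property the Gym-style interface is meant to provide.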

    Included Benchmarks

    BrowserGym ships with integrations for a wide range of web agent benchmarks out of the box:

    • MiniWoB – over 100 synthetic web tasks via the Farama Foundation
    • WebArena and WebArenaVerified – realistic tasks on self-hosted web domains
    • VisualWebArena – visual variants of WebArena tasks
    • WorkArena / WorkArena++ – tasks on the ServiceNow platform
    • AssistantBench – time-consuming open-web research tasks
    • WebLINX – a static dataset of real-world web interaction traces
    • OpenApps – Facebook Research's open application benchmark
    • TimeWarp – a temporal web task benchmark

    Architecture and Setup Path

    Installation is modular via PyPI. The full stack installs with pip install browsergym, while individual benchmark packages (e.g., browsergym-webarena, browsergym-miniwob) can be installed separately to keep dependencies lean. After installation, Playwright's Chromium browser is set up with playwright install chromium. Each benchmark then has its own additional setup steps documented in per-benchmark READMEs.
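
    Since the benchmark packages can be installed independently, it may help to check which ones are present in the current environment before running experiments. The following is a minimal sketch using only the Python standard library. The package names come from the paragraph above; the helper name installed_versions is an assumption made for this example and is not a BrowserGym utility.

```python
"""Report which BrowserGym distributions are installed (standard library only)."""
from importlib import metadata

# Distribution names as mentioned in the text; extend the tuple as needed.
PACKAGES = ("browsergym", "browsergym-miniwob", "browsergym-webarena")


def installed_versions(names=PACKAGES):
    """Map each distribution name to its installed version, or None if absent."""
    report = {}
    for name in names:
        try:
            report[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            report[name] = None
    return report


if __name__ == "__main__":
    for name, version in installed_versions().items():
        print(f"{name}: {version or 'not installed'}")
```

    This only inspects installed Python packages. It does not verify that Playwright's Chromium browser has been downloaded or that any per-benchmark setup has been completed.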

    A companion framework, AgentLab, is maintained alongside BrowserGym and provides higher-level utilities for running agents at scale, collecting traces, and analyzing results across all BrowserGym benchmarks.

    Research Lineage and Publication

    The framework is described in a peer-reviewed paper, "The BrowserGym Ecosystem for Web Agent Research", published in Transactions on Machine Learning Research (2025) with Expert Certification. The WorkArena benchmark was presented at ICML 2024. Experiment traces from the paper are publicly available on Hugging Face. According to the repository metadata, the project had accumulated over 1,200 GitHub stars and 177 forks as of mid-2025.

    Update: v0.14.3

    The latest release is v0.14.3, published on January 20, 2026. The repository remains actively maintained, with the last push recorded in March 2026. Recent additions to the benchmark suite include OpenApps and TimeWarp, signaling continued expansion of the supported task environments. The project's CI pipeline enforces code formatting and unit tests on every push.

    Pricing

    • Full framework access
    • All benchmark integrations
    • Extensible task API
    • AgentLab compatibility
    • Apache 2.0 license

    Capabilities

    Key Features

    • Gym-compatible browser environment for web agents
    • Support for MiniWoB, WebArena, VisualWebArena, WorkArena, AssistantBench, WebLINX, OpenApps, TimeWarp benchmarks
    • Modular pip installation per benchmark
    • Playwright-based Chromium browser automation
    • Extensible AbstractBrowserTask base class for custom benchmarks
    • DOM, screenshot, and accessibility tree observations
    • Demo agent with OpenAI backend
    • Integration with AgentLab for large-scale agent evaluation
    • Open-ended interactive chat task mode
    • Apache 2.0 open-source license

    Integrations

    Playwright
    Gymnasium (gym)
    OpenAI API
    AgentLab
    MiniWoB
    WebArena
    VisualWebArena
    WorkArena
    AssistantBench
    WebLINX
    OpenApps
    TimeWarp
    Hugging Face

    Developer

    ServiceNow

    ServiceNow builds cloud-based workflow automation and AI platforms for enterprise IT, HR, and operations. The company's research division, ServiceNow Research, develops open-source tools and benchmarks for AI agent research, including BrowserGym and WorkArena. ServiceNow Research collaborates with academic institutions and publishes work at top ML venues including ICML and TMLR.

    Founded 2004
    Santa Clara, CA
    $83.7M raised
    25,000 employees

    Used by

    Disney
    Siemens
    Standard Chartered
    U.S. Department of Veterans Affairs

    Similar Tools

    Firecrawl

    An open-source API to search, scrape, crawl, and interact with the web, converting any website into clean, LLM-ready markdown or structured JSON for AI agents and applications.

    TuriX CUA

    TuriX is an open-source computer-use agent (CUA) that lets AI models take real desktop actions on macOS, Windows, and Linux using natural language commands.

    Cua

    Cua is a computer use agent platform that lets you build AI agents capable of seeing screens, clicking buttons, typing, and running code across macOS, Windows, and Linux sandboxes.

    Related Topics

    Browser Automation

    AI-powered agents that autonomously navigate and interact with web applications to automate repetitive tasks, extract data, fill forms, and perform web-based workflows using intelligent understanding of page structure and content.

    95 tools

    Agent Frameworks

    Tools and platforms for building and deploying custom AI agents.

    439 tools

    LLM Evaluations

    Platforms and frameworks for evaluating, testing, and benchmarking LLM systems and AI applications. These tools provide evaluators and evaluation models to score AI outputs, measure hallucinations, assess RAG quality, detect failures, and optimize model performance. Features include automated testing with LLM-as-a-judge metrics, component-level evaluation with tracing, regression testing in CI/CD pipelines, custom evaluator creation, dataset curation, and real-time monitoring of production systems. Teams use these solutions to validate prompt effectiveness, compare models side-by-side, ensure answer correctness and relevance, identify bias and toxicity, prevent PII leakage, and continuously improve AI product quality through experiments, benchmarks, and performance analytics.

    95 tools