    TMax

    Autonomous Systems

    An open-source research codebase for training, evaluating, and deploying simple yet powerful terminal-using LLM agents, covering data generation, SFT, and RL training pipelines.

    At a Glance

    Pricing
    Open Source

    Fully free and open-source under Apache 2.0. Self-host and use the codebase, models, and datasets at no cost.

    Available On

    CLI
    API
    SDK

    Resources

    Website, Docs, GitHub, llms.txt

    Topics

    Autonomous Systems, Agent Frameworks, AI Development Libraries

    Alternatives

    ShinkaEvolve, Open Agent Builder, OSM - Open Skills Manager
    Developer
    WAI, Seattle, WA, Est. 2026

    Listed Jun 2026

    About TMax

    TMax is an open-source project from AllenAI (Allen Institute for AI) focused on building simple, powerful terminal-using agents. Released under the Apache 2.0 license, the codebase covers the full lifecycle of terminal agent development: synthetic data generation, supervised fine-tuning (SFT), reinforcement learning (RL) training, and evaluation against benchmarks like Terminal-Bench and SWE-bench.

    What It Is

    TMax is a research framework for training LLM-based agents that interact with a terminal (bash shell) to complete tasks. The project trains a series of models, referred to as the "tmax" series, and provides the tooling needed to reproduce or extend that work. It is accompanied by a paper on arXiv (2606.23321) and a blog post from the WAI organization. The codebase is written primarily in Python and uses uv for dependency management.

    Four-Stage Pipeline Architecture

    The repository is organized around four distinct stages:

    • Data generation (rl_data/): A scalable, diversity-aware pipeline that synthesizes terminal-agent tasks by sampling from structured compositional axes. Tasks are packaged as self-contained Apptainer/Docker environments with programmatic verifiers, then solved at pass@k and published to Hugging Face Hub.
    • Agent (Vanillux2Agent/): A direct LiteLLM agent built on the vanillux prompt harness (derived from mini-SWE-agent prompts), with a bash tool schema, a submit marker, format-error recovery, and output truncation. It executes commands through Harbor's active environment.
    • Training (training/open-instruct/): A fork of AllenAI's open-instruct repository with fixes for Qwen 3.5 and terminal-agent training. SFT and DPPO RL launch scripts for tmax models are provided under training/open-instruct/scripts/tmax/.
    • Evaluation (scripts/ + beaker_configs/): Shell/Slurm launchers and a Beaker pipeline that serves a model with vLLM and runs Harbor datasets against it.
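
    The agent behaviors named in the list above (a single bash action per turn, a submit marker, format-error recovery, and output truncation) can be illustrated with a minimal sketch. This is not the Vanillux2Agent implementation: the <bash>...</bash> action format, the marker string, the character budget, and every function name are assumptions made for illustration, and the command runner is injected so the sketch runs without a sandbox.

```python
import re
from typing import Callable, Optional, Tuple

# Everything here is hypothetical: the real harness may use a different
# action format, marker string, and truncation budget.
SUBMIT_MARKER = "SUBMIT_FINAL_ANSWER"
MAX_OUTPUT_CHARS = 2000
ACTION_RE = re.compile(r"<bash>(.*?)</bash>", re.DOTALL)


def parse_action(reply: str) -> Optional[str]:
    """Return the single bash command in a model reply, or None on a format error."""
    matches = ACTION_RE.findall(reply)
    if len(matches) != 1:
        return None
    return matches[0].strip()


def truncate_output(text: str, limit: int = MAX_OUTPUT_CHARS) -> str:
    """Keep the head and tail of long command output and note how much was cut."""
    if len(text) <= limit:
        return text
    half = limit // 2
    omitted = len(text) - 2 * half
    tail = text[len(text) - half:]
    return f"{text[:half]}\n[... {omitted} characters omitted ...]\n{tail}"


def step(reply: str, run_command: Callable[[str], str]) -> Tuple[str, bool]:
    """Run one agent turn and return (observation, done)."""
    command = parse_action(reply)
    if command is None:
        # Format-error recovery: tell the model what went wrong and continue.
        return "Format error: reply with exactly one <bash>...</bash> block.", False
    output = run_command(command)
    # Assumed convention: the task is done when the first output line is the marker.
    done = output.lstrip().splitlines()[:1] == [SUBMIT_MARKER]
    return truncate_output(output), done
```

    In a real run, run_command would forward the command to the task's sandbox; in a quick local experiment it can be any function that maps a command string to its output.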

    Task Data and the Harbor Ecosystem

    TMax ships a full 15k task corpus in Harbor format, published on the Harbor registry as tmax/TMax-15K-Harbor. This corpus combines a legacy 10k set of self-contained tasks with 5k newer intricate multi-modal tasks. Every task includes a self-contained Harbor environment and a programmatic verifier, enabling any agent or model to be evaluated directly without regenerating data. The Harbor framework supports both local Docker and cloud-based Daytona sandbox execution.
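
    To make the idea of a programmatic verifier concrete, here is a small sketch of one. The task (write the sorted unique lines of input.txt to output.txt), the file names, and the verify/reward functions are all invented for illustration; the actual Harbor task layout and verifier interface are not described in this listing.

```python
import tempfile
from pathlib import Path


def verify(workdir: Path) -> bool:
    """Hypothetical check: output.txt must hold the sorted unique lines of input.txt."""
    source = workdir / "input.txt"
    result = workdir / "output.txt"
    if not result.is_file():
        return False
    expected = sorted(set(source.read_text().splitlines()))
    return result.read_text().splitlines() == expected


def reward(workdir: Path) -> float:
    """Map the verifier's verdict to a binary reward."""
    return 1.0 if verify(workdir) else 0.0


def demo() -> tuple:
    """Score an unsolved and then a solved copy of the toy task."""
    with tempfile.TemporaryDirectory() as tmp:
        workdir = Path(tmp)
        (workdir / "input.txt").write_text("pear\napple\npear\n")
        before = reward(workdir)  # no output.txt yet
        (workdir / "output.txt").write_text("apple\npear\n")
        after = reward(workdir)   # output matches the expected content
        return before, after
```

    Because the check inspects only the final state of the working directory, the same verifier can score any agent or model that attempts the task.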

    Requirements and Setup Path

    Running TMax requires:

    • uv for Python dependency management
    • An LLM API key (e.g., GEMINI_API_KEY) or a local vLLM/Ollama/OpenAI-compatible endpoint
    • apptainer on PATH for building and running task containers (data generation only)
    • A Docker Hub login and personal access token for training at scale
    • HF_TOKEN for Hugging Face upload and gated model access
    • A container runtime (Docker or Daytona) for evaluating on the published Harbor dataset
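
    A quick preflight check for the requirements above might look like the sketch below. It is not part of the TMax repository; the function name and the default lists are assumptions, and which items actually matter depends on the stage being run (for example, apptainer is needed only for data generation, and GEMINI_API_KEY is just one possible LLM key).

```python
import os
import shutil

# Defaults taken from the requirements list; adjust per stage.
DEFAULT_TOOLS = ("uv", "apptainer", "docker")
DEFAULT_ENV_VARS = ("GEMINI_API_KEY", "HF_TOKEN")


def missing_requirements(tools=DEFAULT_TOOLS, env_vars=DEFAULT_ENV_VARS,
                         which=shutil.which, environ=os.environ):
    """Return a list of executables not on PATH and environment variables not set."""
    missing = []
    for tool in tools:
        if which(tool) is None:
            missing.append(f"tool: {tool}")
    for name in env_vars:
        if not environ.get(name):
            missing.append(f"env: {name}")
    return missing


if __name__ == "__main__":
    problems = missing_requirements()
    print("All requirements found." if not problems else "\n".join(problems))
```

    The which and environ parameters exist only so the check can be exercised without touching the real PATH or environment.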

    The quickstart involves running uv sync, then using provided shell scripts to generate tasks, solve them, analyze pass@k statistics, train models, and run evaluations.
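
    For readers unfamiliar with pass@k, the sketch below shows the unbiased estimator commonly used for code-generation benchmarks: given n sampled attempts of which c passed the verifier, it estimates the probability that at least one of k attempts passes. Whether TMax's analysis scripts use exactly this estimator is an assumption; the function names are illustrative.

```python
from math import comb


def pass_at_k(n: int, c: int, k: int) -> float:
    """Estimate pass@k from n attempts with c successes: 1 - C(n-c, k) / C(n, k)."""
    if not 0 <= c <= n:
        raise ValueError("c must satisfy 0 <= c <= n")
    if not 1 <= k <= n:
        raise ValueError("k must satisfy 1 <= k <= n")
    if n - c < k:
        # Fewer than k failures exist, so any k attempts include a success.
        return 1.0
    return 1.0 - comb(n - c, k) / comb(n, k)


def mean_pass_at_k(results, k: int) -> float:
    """Average pass@k over tasks; results is a list of (n, c) pairs, one per task."""
    return sum(pass_at_k(n, c, k) for n, c in results) / len(results)
```

    For example, a task solved in 3 of 10 attempts has pass_at_k(10, 3, 1) of about 0.3, while a task never solved scores 0.0 for every k.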

    Update: Initial Release

    The repository was created in March 2026 and last updated in June 2026, with the initial release of the codebase, models, and the accompanying arXiv paper ("Tmax: A simple recipe for terminal agents"). The authors include Hamish Ivison, Junjie Oscar Yin, Rulin Shao, Teng Xiao, Nathan Lambert, and Hannaneh Hajishirzi. Models and datasets are published on Hugging Face under the allenai/tmax collection.

    Pricing

    Open Source

    • Full codebase access under Apache 2.0
    • Data generation pipeline
    • SFT and RL training scripts
    • Evaluation pipeline (Terminal-Bench, SWE-bench)
    • 15k Harbor task corpus

    Capabilities

    Key Features

    • Terminal-using LLM agent training and evaluation
    • Compositional synthetic task data generation pipeline
    • Pass@k task solving with programmatic verifiers
    • SFT and DPPO RL training via open-instruct fork
    • Vanillux2Agent with bash tool schema and format-error recovery
    • 15k Harbor task corpus with self-contained environments
    • vLLM model serving integration
    • Beaker and Slurm evaluation pipeline
    • Daytona and Docker sandbox support
    • Hugging Face Hub dataset publishing

    Integrations

    Hugging Face Hub
    vLLM
    LiteLLM
    Harbor framework
    Daytona
    Docker
    Apptainer
    Beaker
    Slurm
    open-instruct
    Qwen 3.5
    Terminal-Bench
    SWE-bench

    Developer

    WAI

    WAI is a research organization formed by NLP, ML, and systems PhDs from the University of Washington. We work on agentic research across the full stack (data, training, and evaluation), and we share what we build: code, data, systems, and write-ups.

    Founded 2026
    Seattle, WA
    5 employees

    Used by

    Allen Institute for AI (Research…

    Similar Tools

    ShinkaEvolve

    An open-source framework that combines LLMs with evolutionary algorithms to automate scientific code discovery and optimization.

    Open Agent Builder

    An open-source framework by Firecrawl for building AI agents with web scraping and data extraction capabilities.

    OSM - Open Skills Manager

    OSM is an open-source agent skills registry and CLI tool for discovering, installing, and publishing reusable skills for AI agents.

    Related Topics

    Autonomous Systems

    AI agents that can perform complex tasks with minimal human guidance.

    304 tools

    Agent Frameworks

    Tools and platforms for building and deploying custom AI agents.

    447 tools

    AI Development Libraries

    Programming libraries and frameworks that provide machine learning capabilities, model integration, and AI functionality for developers.

    228 tools