# Atropos

> An async-first environment microservice framework for reinforcement learning with LLMs, enabling scalable collection and evaluation of LLM trajectories across diverse environments.

Atropos is Nous Research's open-source LLM Reinforcement Learning Gym — an environment microservice framework for async RL with large language models. It provides a flexible, scalable, and standardized platform to accelerate LLM-based RL research across diverse, interactive settings. The framework supports collecting, distributing, and evaluating LLM trajectories through dataset environments, online game environments, RLAIF/RLHF pipelines, multi-turn RL, code execution, and multimodal tasks.

- **Environment Microservice Architecture** — *Each environment runs as an independent service, sending trajectory data to a central API that trainers pull batches from, enabling fully async and distributed RL loops.*
- **Diverse Environment Support** — *Includes dataset environments (GSM8K, MMLU), interactive games (Blackjack, Taxi), RLAIF/RLHF pipelines, multi-turn tool calling, code execution (MBPP, HumanEval), and multimodal tasks (OCR VQA, CLEVR).*
- **OpenAI-Compatible API Integration** — *Works with any OpenAI-compatible inference endpoint including vLLM, SGLang, OpenAI, Together AI, and OpenRouter; no GPU required for local environment development.*
- **Trainer Integrations** — *Native integrations with Axolotl (via plugin) and Tinker for LoRA/QLoRA fine-tuning, plus an included example trainer for reference implementations.*
- **On-Policy Distillation (OPD) Support** — *Carries distillation arrays through `ScoredDataGroup` and API endpoints, enabling teacher-student distillation workflows with `TeacherDistillationEnv`.*
- **Offline Data Generation** — *Use `atropos-sft-gen` and `atropos-dpo-gen` CLI tools to collect rollouts and convert them into SFT or DPO training datasets with rejection sampling controls.*
- **Debugging & Visualization Tools** — *The `process` subcommand runs inference-only rollouts with JSONL output, auto-generated HTML visualizations, and optional Weights & Biases logging; `view-run` launches a Gradio UI for batch inspection.*
- **Easy Installation** — *Install via `pip install atroposlib` or clone the repo and use `pip install -e .[all]` for full development setup with Python 3.10+.*
- **Proven Results** — *Demonstrated 4.6x improvement on parallel tool-calling tasks and 2.5x improvement on financial fundamentals prediction using Atropos-trained models.*
- **Community Environments** — *An `environments/community/` directory and contribution guide make it easy to add and share new RL environments with the broader research community.*
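
To make the microservice architecture described above more concrete, here is a minimal sketch of the data flow: several environments push scored rollouts into a shared buffer, and a trainer pulls fixed-size batches as they become available. It is an illustration only. `TrajectoryBuffer`, `environment`, `trainer`, and `run_demo` are hypothetical names, an in-memory `asyncio.Queue` stands in for the central API, and none of this reflects the actual `atroposlib` interfaces.

```python
import asyncio
import random


class TrajectoryBuffer:
    """Hypothetical in-memory stand-in for a central trajectory API."""

    def __init__(self) -> None:
        self._queue: asyncio.Queue = asyncio.Queue()

    async def push(self, group: dict) -> None:
        await self._queue.put(group)

    async def pull_batch(self, batch_size: int) -> list[dict]:
        # Block until batch_size groups have arrived, then return them.
        return [await self._queue.get() for _ in range(batch_size)]


async def environment(name: str, buffer: TrajectoryBuffer, n_groups: int) -> None:
    """Simulate one environment service producing scored rollouts."""
    for step in range(n_groups):
        await asyncio.sleep(random.uniform(0.0, 0.01))  # pretend inference latency
        await buffer.push({"env": name, "step": step, "score": random.random()})


async def trainer(
    buffer: TrajectoryBuffer, n_batches: int, batch_size: int
) -> list[list[dict]]:
    """Simulate a trainer that pulls batches without waiting on any one environment."""
    return [await buffer.pull_batch(batch_size) for _ in range(n_batches)]


async def run_demo() -> list[list[dict]]:
    buffer = TrajectoryBuffer()
    envs = [environment(f"env-{i}", buffer, n_groups=4) for i in range(3)]
    # 3 environments x 4 groups each = 12 groups = 3 batches of 4.
    results = await asyncio.gather(trainer(buffer, n_batches=3, batch_size=4), *envs)
    return results[0]


if __name__ == "__main__":
    for i, batch in enumerate(asyncio.run(run_demo())):
        print(f"batch {i}: envs={[g['env'] for g in batch]}")
```

Because producers and the consumer only share the buffer, a slow environment delays its own rollouts but does not stall the others, which is the property the async design is meant to provide.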

## Features
- Async-first environment microservice framework
- Trajectory API for collecting and distributing LLM rollouts
- Dataset environments (GSM8K, MMLU, custom HuggingFace datasets)
- Online game environments (Blackjack, Taxi, text-based games)
- RLAIF and RLHF support
- Multi-turn RL for complex multi-step interactions
- Code execution environments (MBPP, HumanEval)
- Multimodal environments (OCR VQA, CLEVR)
- OpenAI-compatible API endpoint support
- vLLM and SGLang native server integrations
- Axolotl trainer plugin integration
- Tinker LoRA trainer integration
- On-Policy Distillation (OPD) support
- `TeacherDistillationEnv` for teacher-student distillation
- `atropos-sft-gen` and `atropos-dpo-gen` CLI tools
- `process` subcommand for inference-only rollouts
- JSONL output and HTML visualization
- Weights & Biases logging
- Gradio UI via `view-run`
- Slurm support for distributed inference
- Pre-commit hooks and contribution guide
- MIT License
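
The offline data generation feature listed above turns scored rollouts into SFT or DPO datasets with rejection sampling. The sketch below shows the general idea under assumed data shapes. The `Rollout` class, the `to_sft_records` and `to_dpo_records` helpers, and the `min_score` and `min_gap` thresholds are hypothetical and do not correspond to the real `atropos-sft-gen` or `atropos-dpo-gen` flags or output schema.

```python
from dataclasses import dataclass


@dataclass
class Rollout:
    """Hypothetical scored rollout; not the atroposlib data type."""
    prompt: str
    completion: str
    score: float


def to_sft_records(rollouts: list[Rollout], min_score: float) -> list[dict]:
    """Rejection sampling for SFT: keep only rollouts scoring at least min_score."""
    return [
        {"prompt": r.prompt, "completion": r.completion}
        for r in rollouts
        if r.score >= min_score
    ]


def to_dpo_records(rollouts: list[Rollout], min_gap: float) -> list[dict]:
    """For each prompt, pair the best and worst completion if their scores differ enough."""
    by_prompt: dict[str, list[Rollout]] = {}
    for r in rollouts:
        by_prompt.setdefault(r.prompt, []).append(r)

    records = []
    for prompt, group in by_prompt.items():
        best = max(group, key=lambda r: r.score)
        worst = min(group, key=lambda r: r.score)
        # Skip prompts where the preference signal is too weak to be useful.
        if best.score - worst.score >= min_gap:
            records.append(
                {"prompt": prompt, "chosen": best.completion, "rejected": worst.completion}
            )
    return records


# Made-up example data for illustration.
rollouts = [
    Rollout("2+2?", "4", 1.0),
    Rollout("2+2?", "5", 0.0),
    Rollout("3+3?", "6", 1.0),
    Rollout("3+3?", "six", 0.9),
]
sft = to_sft_records(rollouts, min_score=0.95)
dpo = to_dpo_records(rollouts, min_gap=0.5)
```

With this toy data, the SFT filter keeps the two top-scoring completions, while the DPO step yields a pair only for the first prompt, because the two completions for the second prompt score too closely to form a clear preference.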

## Integrations
vLLM, SGLang, OpenAI API, Together AI, OpenRouter, Axolotl, Tinker, Weights & Biases, HuggingFace, Slurm, Gradio

## Platforms
Windows, Web, API, Developer SDK, CLI

## Pricing
Open Source

## Version
v0.4.0

## Links
- Website: https://github.com/NousResearch/atropos
- Documentation: https://github.com/NousResearch/atropos/blob/main/atroposlib/envs/README.md
- Repository: https://github.com/NousResearch/atropos
- EveryDev.ai: https://www.everydev.ai/tools/atropos
