    slime

    AI Infrastructure

    An open-source LLM post-training framework for RL scaling that connects Megatron training with SGLang rollout for high-performance reinforcement learning workflows.

    At a Glance

    Pricing
    Open Source

    Fully free and open-source under Apache License 2.0

    Available On

    CLI
    API
    Linux

    Resources

    Website, Docs, GitHub, llms.txt

    Topics

    AI Infrastructure, AI Development Libraries, Multi-agent Systems

    Alternatives

    Colossal-AI, Sentient Foundation, ZeroEval
    Developer
    THUDM

    Listed Jul 2026

    About slime

    slime is an open-source LLM post-training framework for reinforcement learning (RL) scaling, developed by THUDM (Zhipu AI's research group) and released under the Apache 2.0 license. It connects Megatron-LM for training with SGLang for rollout, providing a unified path for training data generation, reward computation, and environment interaction. The project is actively maintained on GitHub and reached v0.3.0 as of May 2026.

    What It Is

    slime provides two core capabilities: high-performance training via Megatron-LM and flexible data generation through custom interfaces and server-based engines. Its design goal is to keep these two capabilities tightly integrated rather than spreading them across a heavy stack of disconnected trainers, rollout services, and agent frameworks. All components (Megatron training, SGLang rollout, custom data generation, reward computation, verifier feedback, and environment interaction) flow through the same training/rollout/Data Buffer path.

    Architecture and Engine Pass-Through

    The framework is built around a three-module architecture:

    • Training (Megatron): Handles the main training process, reads data from the Data Buffer, and synchronizes parameters to the rollout module after each training step.
    • Rollout (SGLang + router): Generates new data, including rewards and verifier outputs, and stores the results in the Data Buffer. Custom generate functions can wrap this with multi-turn loops, tool calls, environment/sandbox interaction, and verifier-based rewards.
    • Data Buffer: A bridge module managing prompt initialization, custom data, and rollout generation methods including agentic workflows.
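
    The flow between the three modules can be illustrated with a small toy loop. This is only a sketch under stated assumptions: the names DataBuffer, Sample, custom_generate, and run are hypothetical and are not slime's API, the "policy" is just a version counter, and the verifier is a placeholder string check. It shows the ordering described above: rollout writes samples into the buffer, training drains the buffer, and the updated weights are handed back to rollout after each step.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class Sample:
    prompt: str
    response: str
    reward: float


class DataBuffer:
    """Hypothetical stand-in for the Data Buffer: holds prompts and rollout samples."""

    def __init__(self, prompts):
        self.prompts = deque(prompts)
        self.samples = []

    def next_prompts(self, n):
        n = min(n, len(self.prompts))
        return [self.prompts.popleft() for _ in range(n)]

    def add(self, samples):
        self.samples.extend(samples)

    def drain(self):
        batch, self.samples = self.samples, []
        return batch


def custom_generate(prompt, policy_version, max_turns=2):
    """Toy multi-turn generate function; each turn appends a fake step."""
    transcript = prompt
    for turn in range(max_turns):
        transcript += f" | turn{turn}@v{policy_version}"
    reward = 1.0 if "2+2" in prompt else 0.0  # placeholder verifier
    return Sample(prompt, transcript, reward)


def run(prompts, steps, batch_size):
    buffer = DataBuffer(prompts)
    policy_version = 0  # stands in for the model weights
    history = []
    for _ in range(steps):
        # Rollout: generate with the current weights, store in the buffer.
        batch_prompts = buffer.next_prompts(batch_size)
        buffer.add([custom_generate(p, policy_version) for p in batch_prompts])
        # Training: read from the buffer, "update", then sync to rollout.
        batch = buffer.drain()
        mean_reward = sum(s.reward for s in batch) / max(len(batch), 1)
        policy_version += 1
        history.append((policy_version, len(batch), mean_reward))
    return history
```

    Calling run(prompts, steps=2, batch_size=2) on four prompts produces one history entry per step, each recording the new policy version, the batch size, and the mean reward. A real setup would replace the counter with Megatron training and the generate function with calls to an SGLang server.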

    slime passes Megatron arguments through directly and exposes SGLang arguments with a --sglang- prefix, so upstream training and serving optimizations remain accessible without additional abstraction layers.
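
    The prefix convention can be sketched as a simple argument splitter. This is an illustration of the idea, not slime's actual parsing code: split_engine_args is a hypothetical helper, and the flags in the usage note are used only as examples of a Megatron-style flag and a prefixed SGLang-style flag.

```python
SGLANG_PREFIX = "--sglang-"


def split_engine_args(argv):
    """Split a flat argv list into (megatron_args, sglang_args).

    Flags starting with --sglang- lose the prefix and go to the SGLang
    list; every other flag goes to the Megatron list unchanged. Value
    tokens follow whichever flag came before them.
    """
    megatron, sglang = [], []
    target = megatron
    for token in argv:
        if token.startswith("--"):
            if token.startswith(SGLANG_PREFIX):
                target = sglang
                token = "--" + token[len(SGLANG_PREFIX):]
            else:
                target = megatron
        target.append(token)
    return megatron, sglang
```

    For example, passing ["--tensor-model-parallel-size", "2", "--sglang-mem-fraction-static", "0.7"] yields a Megatron list with the first flag and its value, and an SGLang list containing "--mem-fraction-static" and "0.7". The point of the design is that neither engine's options need to be re-declared by the framework.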

    Production Validation and Model Support

    According to the project documentation, slime is the RL framework behind the GLM model family including GLM-5.2, GLM-5.1, GLM-5, GLM-4.7, GLM-4.6, and GLM-4.5. Beyond the GLM family, slime supports:

    • Qwen series: Qwen3.6, Qwen3.5, Qwen3Next, Qwen3MoE, Qwen3, Qwen2.5
    • DeepSeek V3 series: DeepSeek V3, DeepSeek V3.1, DeepSeek R1
    • Llama 3

    The framework has been exercised through large-scale training runs including a 256×H100 configuration for GLM-5.2 (744B-A40B MoE) and 128×H100 for DeepSeek R1.

    Advanced Features

    slime includes several production-grade capabilities:

    • BF16 training with FP8 rollout: Large MoE recipes use Megatron BF16 training state with SGLang FP8 rollout/inference
    • PD Disaggregation: Separate prefill/decode resource allocation for multi-turn and agentic workloads
    • Delta Weight Sync: Efficient weight updates for training/inference disaggregation
    • Speculative Decoding: Supported for rollout acceleration
    • On-Policy Distillation: Hindsight-based training signal extraction
    • Fault Tolerance and Reproducibility: First-class engineering concerns with documented CI coverage
    • AMD hardware support: Platform-specific tutorial available
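
    The listing does not describe how Delta Weight Sync works internally. One plausible reading, shown here purely as an assumption, is that only the parameters that changed since the last sync are sent from the trainer to the rollout engine. The helpers compute_delta and apply_delta are hypothetical, and plain Python lists stand in for tensors.

```python
def compute_delta(old, new, tol=0.0):
    """Return only the entries of `new` that differ from `old` by more than tol."""
    delta = {}
    for name, values in new.items():
        previous = old.get(name)
        changed = (
            previous is None
            or len(previous) != len(values)
            or any(abs(a - b) > tol for a, b in zip(previous, values))
        )
        if changed:
            delta[name] = list(values)
    return delta


def apply_delta(weights, delta):
    """Return a copy of `weights` with the changed entries replaced."""
    updated = {name: list(values) for name, values in weights.items()}
    for name, values in delta.items():
        updated[name] = list(values)
    return updated
```

    With two parameters where only one changed, compute_delta returns a single-entry dictionary, and apply_delta on the rollout side reconstructs the trainer's full weights from it. The real mechanism in slime may differ, for example in how changes are detected or encoded.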

    Ecosystem and Adoption Signal

    The project has attracted a growing ecosystem of frameworks built on slime as an RL substrate. Notable projects include Dressage (Alibaba Accio), Miles (RadixArk), vime (vLLM project), Relax (RedAI Infra), OpenClaw-RL, P1 (physics reasoning), RLVE, TritonForge, APRIL, qqr (Alibaba NLP), and ART (AWS Bedrock AgentCore Runtime). The GitHub repository reports 7,174 stars and 1,018 forks as of the data snapshot.

    Update: v0.3.0

    The latest release is v0.3.0, published May 31, 2026. The project was created in June 2025 and has seen rapid development, with the most recent push to the repository on June 30, 2026. The v0.1.0 release blog post, titled "Redefining High-Performance RL Training Frameworks," and an introductory post on the LMSYS blog describe the framework's design philosophy and how it differs from alternatives like veRL and OpenRLHF.

    Pricing

    Open Source

    Fully free and open-source under Apache License 2.0

    • Full source code access
    • Megatron + SGLang integration
    • Agentic RL workflows
    • Community support via GitHub Issues
    • All advanced features included

    Capabilities

    Key Features

    • Megatron-LM training integration
    • SGLang-native rollout backend
    • Unified training/rollout/Data Buffer path
    • Custom data generation interfaces
    • Agentic RL workflows (multi-agent, coding agent, search/RAG)
    • BF16 training with FP8 rollout
    • PD disaggregation for prefill/decode separation
    • Delta weight sync
    • Speculative decoding
    • On-policy distillation
    • Fault tolerance and reproducibility
    • Observability and trace viewer
    • Profiling support
    • Low precision training and rollout
    • SGLang config YAML for topology control
    • External rollout engine support
    • Dense and MoE model support
    • Fully-async rollout
    • CPU unit tests and GPU end-to-end CI
    • AMD hardware platform support

    Integrations

    Megatron-LM
    SGLang
    DeepSeek R1
    Qwen3
    GLM model family
    Llama 3
    Gemma4
    vLLM (via vime)
    Ray
    AWS Bedrock AgentCore Runtime
    E2B sandbox
    Kubernetes
    bwrap

    Developer

    THUDM

    THUDM (Tsinghua University Data Mining group) builds large-scale AI research tools and benchmarks, including AgentBench and AgentRL. The group develops open-source frameworks for evaluating and training LLM-based agents across diverse real-world environments. Their work spans language model evaluation, reinforcement learning for agents, and multimodal AI systems, with publications at top venues like ICLR.


    Similar Tools

    Colossal-AI

    An open-source distributed deep learning framework that maximizes runtime performance for large neural networks using advanced parallelism techniques.

    Sentient Foundation

    Open-source AGI foundation uniting builders, researchers, and communities to develop transparent, collaborative artificial general intelligence.

    ZeroEval

    Open-source evaluation framework for testing large language models with zero-shot prompting on reasoning and coding tasks.

    Related Topics

    AI Infrastructure

    Infrastructure designed for deploying and running AI models.

    302 tools

    AI Development Libraries

    Programming libraries and frameworks that provide machine learning capabilities, model integration, and AI functionality for developers.

    244 tools

    Multi-agent Systems

    Platforms for creating and managing teams of AI agents that can collaborate.

    232 tools