EveryDev.ai

    Miles

    AI Infrastructure

    Enterprise-grade reinforcement learning framework for large-scale LLM and VLM post-training, featuring high-performance rollout, low-precision training, and production stability.

    At a Glance

    Pricing
    Open Source

    Fully open-source under Apache License 2.0. Free to use, modify, and distribute.

    Available On

    CLI
    API
    SDK

    Resources

    Website
    Docs
    GitHub
    llms.txt

    Topics

    AI Infrastructure
    Multi-agent Systems
    LLM Orchestration

    Alternatives

    Ray
    CXO AGI
    Swarms AI
    Developer
    radixark
    Redwood City, CA
    Est. 2025
    $100M raised

    Listed May 2026

    About Miles

    Miles is an open-source reinforcement learning (RL) framework built for enterprise-scale post-training of large language models (LLMs) and vision-language models (VLMs). It is a fork of the slime project, developed jointly by InfiXAI, Ant Group, the SGLang RL Team, and the Miles community. The project launched in November 2025, is actively maintained, and is licensed under the Apache License 2.0.

    What It Is

    Miles aims to combine research-grade RL with production-grade reliability. It integrates SGLang for high-throughput rollout and Megatron-LM for scalable distributed training, and targets the system-level causes of instability and inefficiency when reinforcement learning is applied to models at the 1TB+ scale. The framework is designed as a unified entry point for complex RL workloads, including multi-turn interaction, vision-language training, reasoning, coding agents, and multi-agent co-evolution.

    Core Technical Architecture

    Miles addresses several fundamental problems in large-scale RL training through system-level innovations:

    • Unified FP8 Pipeline: End-to-end FP8 sampling and training, so rollout and training use the same numerics. The project says this removes the quantization-induced discrepancy between the two that can cause RL collapse in large mixture-of-experts (MoE) models.
    • Rollout Routing Replay (R3): Records expert routing decisions during SGLang inference and replays them during Megatron training, so both engines activate the same experts for each token (bit-wise expert alignment) in MoE architectures such as Qwen3 and DeepSeek-V3.
    • INT4 QAT Support: Full-stack INT4 W4A16 (4-bit weights, 16-bit activations) Quantization-Aware Training pipeline, inspired by the Kimi K2-Thinking report. According to the project, it lets 1TB-scale models fit into single-machine VRAM (e.g., NVIDIA H200) and doubles rollout efficiency.
    • Zero-Copy Weight Sync: Optimized weight refit via CUDA IPC zero-copy mapping, async tensor gathering, and bucketed flattening, reducing sync time by 50% compared to standard HTTP/RPC transfers (per project documentation).
    • Speculative RL Training: Uses an Online SFT Draft Model that updates during RL to prevent policy drift, achieving 25%+ rollout speedup according to the project's own benchmarks.
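
    The idea behind Rollout Routing Replay in the list above can be shown with a toy example. The sketch below is not Miles code: the `top_k_experts` and `moe_forward` helpers, the scalar "experts", and the logit values are all hypothetical, chosen so that a small numeric drift between the rollout engine and the training engine flips the top-2 expert choice unless the recorded indices are replayed.

```python
import math

def top_k_experts(logits, k):
    """Return indices of the k largest router logits (ties broken by index)."""
    return sorted(range(len(logits)), key=lambda i: (-logits[i], i))[:k]

def moe_forward(token, router_logits, experts, k=2, replay=None):
    """Toy MoE layer for a single scalar token.

    If `replay` is given, the recorded expert indices are reused instead of
    recomputing top-k from this engine's (possibly drifted) router logits.
    Returns (output, chosen_expert_indices).
    """
    chosen = list(replay) if replay is not None else top_k_experts(router_logits, k)
    # Softmax over the chosen experts' logits only.
    peak = max(router_logits[i] for i in chosen)
    exps = [math.exp(router_logits[i] - peak) for i in chosen]
    total = sum(exps)
    output = sum((e / total) * experts[i](token) for e, i in zip(exps, chosen))
    return output, chosen

# Four hypothetical experts; each one scales its input by a different factor.
experts = [lambda x, s=s: s * x for s in (1.0, 2.0, 3.0, 4.0)]

# Router logits as seen by the rollout engine, and by the training engine
# after a tiny numeric drift that swaps the order of experts 1 and 2.
rollout_logits = [0.10, 0.50, 0.49, 0.90]
train_logits = [0.10, 0.49, 0.50, 0.90]

_, recorded = moe_forward(1.0, rollout_logits, experts)                # record during rollout
_, free_choice = moe_forward(1.0, train_logits, experts)               # training without replay
_, replayed = moe_forward(1.0, train_logits, experts, replay=recorded)  # training with replay
```

    Without replay, the training pass picks a different expert set than the one that generated the sample, so gradients flow through experts the rollout never used. Replaying the recorded indices keeps the two passes on the same experts while the gate weights still come from the training engine's own logits.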

    Model Support and Training Scenarios

    Miles supports a broad range of state-of-the-art model families:

    • DeepSeek: R1, V3, V3.2
    • Qwen: 2, 2.5, 3
    • Llama: 3, 3.1, 3.3, 4
    • Gemma: 2, 3, 3N
    • GLM: 4.5, 4.6, 4.7
    • MiniMax: M2, M2.1
    • Others: Mistral, Mixtral, Phi, gpt-oss, and any model supported by SGLang and Megatron

    Training scenarios span multi-turn interaction, unified VLM/LLM workflows, reasoning and coding tasks, and multi-agent co-evolutionary frameworks such as MrlX.

    Setup Path

    Miles recommends using its official Docker image for best performance and compatibility. It can also be installed from source via pip. Training is launched through a unified train.py entry point with command-line arguments for configuring cluster resources, training backends (Megatron/FSDP), SGLang inference optimization, and RL algorithmic hyperparameters. A detailed argument guide and Quick Start documentation are available in the repository's docs/ directory.

    Update: Active Development Through Early 2026

    The project has seen rapid iteration since its November 2025 launch. Notable recent additions include:

    • [2026/02] Detailed command-line argument guide for Miles server configuration
    • [2026/01] INT4 QAT pipeline for single-machine 1TB model training
    • [2026/01] Unified VLM/LLM multi-turn training support
    • [2026/01] MrlX multi-agent co-evolutionary framework integration
    • [2025/12] Rollout Routing Replay (R3) for MoE RL stability
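
    For readers unfamiliar with the INT4 QAT entry above, the sketch below shows group-wise symmetric INT4 quantize-dequantize ("fake quantization"), the kind of operation QAT pipelines commonly apply to weights in the forward pass while activations stay in 16-bit. The function name, the group size of 4, and the symmetric scheme are assumptions for illustration; the listing does not describe the quantization format Miles actually uses.

```python
def fake_quant_int4(weights, group_size=4):
    """Quantize-dequantize floats with one symmetric INT4 scale per group.

    Returns (dequantized_values, scales). INT4 codes lie in [-8, 7]; the
    scale maps the largest magnitude in each group to code 7.
    """
    dequantized, scales = [], []
    for start in range(0, len(weights), group_size):
        group = weights[start:start + group_size]
        max_abs = max(abs(w) for w in group)
        scale = max_abs / 7.0 if max_abs > 0 else 1.0  # avoid dividing by zero
        scales.append(scale)
        for w in group:
            code = max(-8, min(7, round(w / scale)))  # signed 4-bit integer code
            dequantized.append(code * scale)
    return dequantized, scales

# Hypothetical weights: two groups with different dynamic ranges.
weights = [0.7, -0.2, 0.31, 0.0, 1.4, -1.4, 0.45, 0.05]
approx, scales = fake_quant_int4(weights)
```

    Each dequantized value differs from the original by at most half a quantization step for its group. Storing 4-bit codes plus one scale per group is what shrinks weight memory to roughly a quarter of a 16-bit checkpoint, before counting the scales.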

    The roadmap lists planned support for Diffusion RL, Omni RL, Diffusion LLM RL, and elastic resource scheduling. The repository had 1,378 stars and 220 forks as of late May 2026, per GitHub metadata.

    Pricing

    Open Source

    • Full Miles framework source code
    • Unified FP8 training and rollout
    • INT4 QAT pipeline
    • SGLang and Megatron-LM integration
    • Multi-turn LLM and VLM training

    Capabilities

    Key Features

    • Unified FP8 end-to-end training and rollout pipeline
    • INT4 Quantization-Aware Training (QAT) for 1TB+ models
    • Rollout Routing Replay (R3) for MoE RL stability
    • Zero-copy weight synchronization via CUDA IPC
    • Speculative RL training with Online SFT Draft Model
    • Multi-turn LLM and VLM training support
    • Multi-agent co-evolutionary RL (MrlX)
    • Truncated and Masked Importance Sampling (TIS/MIS)
    • Partial rollout and over-sampling for long-tail RL
    • Support for DeepSeek, Qwen, Llama, Gemma, GLM, MiniMax, Mistral, Phi
    • SGLang integration for high-throughput rollout
    • Megatron-LM integration for scalable distributed training
    • FSDP training backend support
    • Docker image for production deployment
    • Detailed command-line argument configuration
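
    The feature list names Truncated and Masked Importance Sampling (TIS/MIS) without defining them. One common formulation, sketched below under assumptions, weights each token by the ratio of its probability under the training engine to its probability under the rollout engine, then either caps that ratio (truncation) or zeroes out tokens whose ratio falls outside a trusted band (masking). The function names, the cap of 2.0, and the band of 0.5 to 2.0 are hypothetical, and Miles' own definitions may differ.

```python
import math

def importance_ratio(logp_train, logp_rollout):
    """Per-token ratio p_train / p_rollout, computed from log-probabilities."""
    return math.exp(logp_train - logp_rollout)

def truncated_is(logps_train, logps_rollout, cap=2.0):
    """Truncated importance sampling: clip each ratio from above at `cap`."""
    return [min(importance_ratio(t, r), cap)
            for t, r in zip(logps_train, logps_rollout)]

def masked_is(logps_train, logps_rollout, low=0.5, high=2.0):
    """Masked importance sampling: drop tokens whose ratio leaves [low, high]."""
    weights = []
    for t, r in zip(logps_train, logps_rollout):
        ratio = importance_ratio(t, r)
        weights.append(ratio if low <= ratio <= high else 0.0)
    return weights

# Hypothetical per-token probabilities from the two engines.
logps_train = [math.log(0.5), math.log(0.9), math.log(0.1)]
logps_rollout = [math.log(0.5), math.log(0.3), math.log(0.4)]

tis_weights = truncated_is(logps_train, logps_rollout)  # ratios 1.0, 3.0, 0.25 with 3.0 capped
mis_weights = masked_is(logps_train, logps_rollout)     # only the first token stays in the band
```

    Truncation keeps every token but bounds how much any single one can dominate the gradient; masking discards tokens where the two engines disagree too strongly. Which trade-off is preferable depends on how large the rollout/training mismatch is in practice.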

    Integrations

    SGLang
    Megatron-LM
    FSDP
    FlashAttention-3
    DeepGEMM
    NVIDIA Transformer Engine
    Docker
    CUDA IPC
    MrlX
    slime

    Developer

    radixark

    radixark builds enterprise-grade open-source infrastructure for large-scale AI model training. The organization develops Miles, a reinforcement learning framework for LLM and VLM post-training, in collaboration with InfiXAI, Ant Group, and the SGLang RL Team. Miles focuses on production stability, low-precision training, and high-throughput rollout for models at the 1TB+ scale.

    Founded 2025
    Redwood City, CA
    $100M raised
    30 employees

    Used by

    DeepLearning.AI (Educational partner)
    LMSYS Org (Collaboration)
    Open-source developers using SGLang

    Similar Tools

    Ray

    Ray is an open-source AI compute engine that pairs a distributed Python runtime with libraries for training, tuning, serving, and reinforcement learning.

    CXO AGI

    CXO AGI builds the superintelligence operating system for enterprise, delivering deterministic AI systems with predictable, production-grade reliability.

    Swarms AI

    Enterprise-grade multi-agent framework for building, deploying, and scaling autonomous AI agent swarms with advanced collaboration and communication protocols.

