EveryDev.ai

    SWE-smith

    Agent Harness
    Featured

    An open-source toolkit for generating training data and task instances for software engineering agents, enabling fine-tuning of LMs on real GitHub repositories.

    At a Glance

    Pricing
    Open Source

    Fully free and open-source under the MIT License. All features, dataset, and model weights are freely available.

    Available On

    Linux (developed and tested on Ubuntu 22.04.4 LTS; Windows and macOS are not supported)
    API
    SDK
    CLI

    Resources

    Website, Docs, GitHub, llms.txt

    Topics

    Agent Harness, AI Development Libraries, Human-in-the-Loop Training

    Alternatives

    Verifiers, OpenHarness, OpenTraces

    Developer

    SWE-bench, Princeton, NJ, Est. 2023

    Listed May 2026

    About SWE-smith

    SWE-smith is an open-source toolkit for generating training data for software engineering (SWE) agents, developed by researchers at Stanford University, Princeton Language & Intelligence, and Alibaba Qwen. Released in April 2025, it lets users turn any GitHub repository into a "SWE-gym" and synthesize hundreds to thousands of task instances (including file localization, program repair, and SWE-bench-style tasks) for training language models. The accompanying paper was accepted as a Spotlight in the NeurIPS 2025 Datasets & Benchmarks Track.

    What It Is

    SWE-smith is a data generation pipeline and training toolkit that addresses the scarcity of high-quality training data for software engineering agents. It automates creating execution environments from GitHub repositories, synthesizing bug-inducing task instances, filtering them by unit test breakage, and generating natural-language issue descriptions. The result is a scalable dataset factory that can produce task instances for any Python-based GitHub repository.

    How the Pipeline Works

    The SWE-smith workflow follows four main steps:

    • Environment construction: Wrap a GitHub repository in a Docker-based execution environment.
    • Task synthesis: Automatically generate code mutations that introduce bugs or regressions.
    • Harness filtering: Keep only tasks that break one or more unit tests, ensuring task validity.
    • Issue generation: Produce natural-language issue descriptions for each task, mimicking real GitHub issues.
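
The task synthesis and harness filtering steps above can be illustrated with a toy, self-contained sketch. None of this is SWE-smith's actual code: the `clamp` function, the comparison-swap mutation, the in-process test runner, and the `fail_to_pass` key are all hypothetical stand-ins, and the real pipeline runs tests inside Docker environments rather than in the same interpreter.

```python
import ast
from typing import Callable, Dict, List

# Hypothetical target function standing in for code from a real repository.
SOURCE = '''
def clamp(x, lo, hi):
    if x < lo:
        return lo
    if x > hi:
        return hi
    return x
'''

# Hypothetical unit tests: name -> (arguments, expected result).
TESTS = {
    "test_below": ((-5, 0, 10), 0),
    "test_above": ((15, 0, 10), 10),
    "test_inside": ((5, 0, 10), 5),
}


class SwapComparison(ast.NodeTransformer):
    """Flip the n-th `<` or `>` comparison (0-based) to introduce a bug."""

    def __init__(self, target_index: int) -> None:
        self.target_index = target_index
        self.seen = 0

    def visit_Compare(self, node: ast.Compare) -> ast.Compare:
        self.generic_visit(node)
        if isinstance(node.ops[0], (ast.Lt, ast.Gt)):
            if self.seen == self.target_index:
                node.ops[0] = ast.Gt() if isinstance(node.ops[0], ast.Lt) else ast.Lt()
            self.seen += 1
        return node


def mutate(source: str, index: int) -> str:
    """Task synthesis: return a candidate with one comparison flipped (if it exists)."""
    tree = SwapComparison(index).visit(ast.parse(source))
    ast.fix_missing_locations(tree)
    return ast.unparse(tree)  # requires Python 3.9+


def load_clamp(source: str) -> Callable:
    """Execute the candidate source and return its `clamp` function."""
    namespace: Dict[str, object] = {}
    exec(compile(source, "<candidate>", "exec"), namespace)
    return namespace["clamp"]


def failing_tests(clamp: Callable) -> List[str]:
    """Run every test and return the names of those that fail."""
    return [name for name, (args, expected) in TESTS.items() if clamp(*args) != expected]


def synthesize_tasks(source: str, attempts: int = 3) -> List[dict]:
    """Harness filtering: keep only candidates that break at least one test."""
    tasks = []
    for index in range(attempts):
        broken = failing_tests(load_clamp(mutate(source, index)))
        if broken:  # a candidate that breaks nothing is not a valid task
            tasks.append({"mutation_index": index, "fail_to_pass": broken})
    return tasks


tasks = synthesize_tasks(SOURCE)
for task in tasks:
    print(task)
```

The third attempt finds no comparison to flip, so the candidate is unchanged, breaks no tests, and is discarded. That is the filtering criterion in miniature: a mutation only becomes a task if the test suite can detect it.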

    The toolkit requires Docker and was developed and tested on Ubuntu 22.04.4 LTS. The project explicitly states it does not plan to support Windows or macOS.
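
Given those requirements, a short preflight check before running the toolkit might look like the sketch below. It is an illustrative helper, not part of SWE-smith; the function name `preflight_problems` and the message wording are assumptions.

```python
import platform
import shutil
from typing import List, Optional


def preflight_problems(system: str, docker_path: Optional[str]) -> List[str]:
    """Return human-readable problems with the host; an empty list means none found."""
    problems = []
    if system != "Linux":
        problems.append(
            f"unsupported OS {system!r}: the project was developed and tested on Ubuntu 22.04.4 LTS"
        )
    if docker_path is None:
        problems.append("docker executable not found on PATH")
    return problems


# Check the current machine: platform.system() returns e.g. "Linux", "Darwin", "Windows".
for problem in preflight_problems(platform.system(), shutil.which("docker")):
    print("warning:", problem)
```

This only checks that a `docker` executable is on the PATH; it does not verify that the Docker daemon is running or that the current user may talk to it.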

    Dataset and Model Resources

    The SWE-bench organization publishes several artifacts alongside the toolkit:

    • 52,000+ task instances across 128 popular GitHub repositories, available on Hugging Face as SWE-bench/SWE-smith.
    • SWE-agent-LM-32B, a fine-tuned version of Qwen 2.5 Coder trained on SWE-smith data, which the project reports achieves 40.2% pass@1 on SWE-bench Verified — described by the authors as a +32% jump over the base model.
    • 26,000 SWE-agent trajectories, including the 5,000 used to train SWE-agent-LM-32B.
    • 250+ Docker environments, one per repository represented in the dataset.

    Training Integrations

    According to the project documentation, SWE-smith has been used with two training paradigms:

    • Supervised fine-tuning of Qwen 2.5 Coder into SWE-agent-LM-32B using the SWE-agent framework.
    • GRPO-style reinforcement learning using the SkyRL framework from NovaSky-AI.
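
For readers unfamiliar with the "GRPO-style" label: the central idea is to sample several rollouts for the same task and score each one relative to its own group. The sketch below shows only that normalization step. It is not SkyRL's implementation; the binary reward (1.0 when a rollout's patch makes the failing tests pass, otherwise 0.0), the use of the population standard deviation, and the `eps` value are assumptions for illustration.

```python
from statistics import mean, pstdev
from typing import List


def group_relative_advantages(rewards: List[float], eps: float = 1e-6) -> List[float]:
    """Normalize each reward against the mean and spread of its own group."""
    mu = mean(rewards)
    sigma = pstdev(rewards)  # population standard deviation of the group
    # eps avoids division by zero when every rollout gets the same reward
    return [(r - mu) / (sigma + eps) for r in rewards]


# Four hypothetical rollouts on one task: two resolved it, two did not.
rewards = [1.0, 0.0, 0.0, 1.0]
advantages = group_relative_advantages(rewards)
print(advantages)
```

Successful rollouts receive positive advantages and failed ones negative. When every rollout in a group gets the same reward, all advantages are zero and the group contributes no learning signal, which is one reason tasks with a verifiable pass/fail outcome at a suitable difficulty are useful for this style of training.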

    The Python API makes it straightforward to load task instances from Hugging Face Datasets and spin up Docker containers pre-initialized with each task, leaving the training loop to the user.
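
A rough sketch of that workflow follows, using the Hugging Face `datasets` library and the Docker CLI directly rather than SWE-smith's own Python API, whose function names are not given here. The `train` split, the `image_name` and `instance_id` fields, the example values, and the presence of `/bin/bash` in the image are all assumptions; check the dataset card for the actual schema.

```python
from typing import Dict, List

DATASET_ID = "SWE-bench/SWE-smith"  # dataset name as given in this listing


def load_tasks(split: str = "train"):
    """Load task instances from the Hugging Face Hub.

    Needs network access and `pip install datasets`; the split name is an assumption.
    """
    from datasets import load_dataset  # imported lazily so the helper below works offline

    return load_dataset(DATASET_ID, split=split)


def docker_run_command(task: Dict[str, str], image_field: str = "image_name") -> List[str]:
    """Build a `docker run` command that opens a shell in the task's image.

    `image_field` is a hypothetical column name holding the Docker image reference.
    """
    return ["docker", "run", "--rm", "-it", task[image_field], "/bin/bash"]


# Hypothetical record, used here so the example runs without network access.
example_task = {"instance_id": "example__repo.abc123", "image_name": "example/image:latest"}
command = docker_run_command(example_task)
print(" ".join(command))
```

With network access, `load_tasks()` would return the dataset, and a record from it could be passed to `docker_run_command` and then to `subprocess.run` to start the container. The training loop itself is left to the user, as the listing notes.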

    Update: NeurIPS 2025 Spotlight and Open-Source Release

    SWE-smith was publicly released on April 30, 2025, with the full toolkit, dataset, model weights, and trajectories open-sourced under the MIT license. The paper (arXiv:2504.21798) was accepted as a Spotlight in the NeurIPS 2025 Datasets & Benchmarks Track. The repository's most recent push was in May 2026, which suggests ongoing development. The project is part of a broader SWE-bench ecosystem that includes SWE-bench, SWE-agent, Mini-SWE-Agent, SWE-ReX, and sb-cli.

    Pricing

    • Full toolkit source code under MIT License
    • 52k+ task instances on Hugging Face
    • 250+ Docker environments
    • SWE-agent-LM-32B model weights
    • 26k SWE-agent trajectories

    Capabilities

    Key Features

    • Turn any GitHub repository into a SWE-gym execution environment
    • Synthesize hundreds to thousands of task instances (file localization, program repair, SWE-bench-style)
    • Filter tasks by unit test breakage for quality assurance
    • Generate natural-language issue descriptions for tasks
    • 52k+ pre-built task instances across 128 GitHub repositories
    • Docker-based isolated execution environments
    • Python API for loading tasks and spinning up containers
    • Supports supervised fine-tuning and GRPO-style reinforcement learning
    • Compatible with SWE-agent training framework
    • Fine-tuned SWE-agent-LM-32B model weights available on Hugging Face

    Integrations

    Docker
    Hugging Face Datasets
    SWE-agent
    SkyRL
    Qwen 2.5 Coder
    GitHub
    SWE-bench
    SWE-ReX

    Developer

    SWE-bench

    SWE-bench builds open-source benchmarks and tooling for evaluating large language models on real-world software engineering tasks. The project originates from Princeton and Stanford researchers, led by Carlos E. Jimenez and John Yang. It produces benchmark datasets, evaluation harnesses, fine-tuned models, and companion tools like SWE-agent and SWE-smith to advance AI software engineering research.

    Founded 2023
    Princeton, NJ
    15 employees

    Used by

    OpenAI
    Anthropic
    Google DeepMind
    Meta AI
    +1 more

    Similar Tools

    Verifiers

    An open-source Python library by Prime Intellect for creating environments to train and evaluate LLMs using reinforcement learning.

    OpenHarness

    A composable, open-source TypeScript SDK for building powerful AI agent harnesses with stateless primitives, middleware, and hierarchical subagents on any model provider.

    OpenTraces

    A CLI tool to parse, sanitize, and commit AI agent session traces to HuggingFace Hub for training, evaluation, and open data sharing.

    Related Topics

    Agent Harness

    Infrastructure, orchestrators, and task runners that wrap around LLM coding agents — covering session management, context delivery, worktree isolation, architecture enforcement, and issue-to-PR pipelines.

    83 tools

    AI Development Libraries

    Programming libraries and frameworks that provide machine learning capabilities, model integration, and AI functionality for developers.

    189 tools

    Human-in-the-Loop Training

    Platforms that connect organizations with vetted human experts to annotate, label, evaluate, and align AI models, ensuring high-quality training datasets and accurate model evaluation through human judgment.

    27 tools