EveryDev.ai
    autoresearch

    Autonomous Systems
    Featured

    Autonomous AI agent that iteratively experiments on single-GPU LLM training code overnight.

    At a Glance

    Pricing
    Open Source

    Fully open source under MIT license. No cost to use, modify, or distribute.

    Available On

    Linux
    SDK

    Topics

    Autonomous Systems, Agent Frameworks, AI Infrastructure

    Alternatives

    Sciloop, Devin, Devin Desktop
    Developer
    Andrej Karpathy, San Francisco, CA, Est. 2024

    Updated May 2026

    About autoresearch

    autoresearch is an open-source framework that puts an AI agent in charge of running machine learning experiments autonomously. You point a coding agent (Claude, Codex, or similar) at a small but real LLM training setup. The agent then modifies the training code, runs a 5-minute training experiment, checks whether the validation metric improved, keeps or discards the change, and repeats. The result is a log of experiments and, ideally, a better model by morning.
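The modify, train, evaluate, keep-or-discard cycle described above can be sketched as a greedy search loop. This is a minimal illustration, not the project's code: `run_research_loop`, `toy_propose`, and `toy_evaluate` are hypothetical names, and the toy "experiment" is a one-parameter function standing in for a real 5-minute training run.

```python
import random
from typing import Callable, Dict, List, Tuple

Config = Dict[str, float]


def run_research_loop(
    propose: Callable[[Config], Config],
    evaluate: Callable[[Config], float],
    baseline: Config,
    n_experiments: int,
) -> Tuple[Config, float, List[dict]]:
    """Greedy keep-or-discard loop; a lower score is better."""
    best_config = dict(baseline)
    best_score = evaluate(best_config)
    log: List[dict] = []
    for i in range(n_experiments):
        candidate = propose(best_config)   # "modify the training code"
        score = evaluate(candidate)        # "run the experiment, read the metric"
        kept = score < best_score          # keep only strict improvements
        if kept:
            best_config, best_score = candidate, score
        log.append({"experiment": i, "score": score, "kept": kept})
    return best_config, best_score, log


# Toy stand-ins (assumptions for illustration only).
_rng = random.Random(0)


def toy_propose(config: Config) -> Config:
    """Perturb the learning rate by a random factor."""
    return {"lr": config["lr"] * _rng.uniform(0.5, 1.5)}


def toy_evaluate(config: Config) -> float:
    """Pretend validation metric with its minimum at lr = 0.03."""
    return 1.0 + (config["lr"] - 0.03) ** 2


best, score, history = run_research_loop(toy_propose, toy_evaluate, {"lr": 0.1}, 20)
print(f"kept {sum(e['kept'] for e in history)} of {len(history)} experiments")
```

In the real workflow the proposal step is the coding agent editing train.py and the evaluation step is a full training run, so each iteration is far more expensive than in this toy.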

    The training base is a simplified single-GPU implementation of nanochat (a GPT-style model). The human's role shifts from writing Python to writing program.md—a Markdown file that acts as the agent's standing instructions and research strategy. The agent exclusively edits train.py, which contains the full model architecture, optimizer, and training loop.

    • Fixed Time Budget - Every experiment runs for exactly 5 wall-clock minutes, making results directly comparable regardless of architecture or hyperparameter changes
    • Single Metric - Validation bits-per-byte (val_bpb) is the objective. Lower is better, and because the metric does not depend on vocabulary size, architectural changes are compared fairly
    • Single File Editing - The agent only modifies train.py, keeping diffs small and reviewable
    • program.md Interface - Human researchers guide the agent by editing a Markdown instruction file rather than Python code
    • Self-Contained - No distributed training, no complex configs; one GPU, one file, one metric
    • Autonomous Loop - Runs about 12 experiments per hour (one every 5 minutes), or roughly 100 in an overnight run
    • MIT Licensed - Fully open source under the permissive MIT license
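To see why bits-per-byte does not depend on vocabulary size, it helps to write the metric out. The sketch below uses the standard definition (total negative log-likelihood converted from nats to bits, divided by the number of UTF-8 bytes). How autoresearch computes val_bpb internally is not stated here, so `bits_per_byte` and its arguments are illustrative assumptions.

```python
import math
from typing import Sequence


def bits_per_byte(
    token_nll_nats: Sequence[float],
    token_byte_lengths: Sequence[int],
) -> float:
    """Total loss in bits divided by total bytes of the underlying text."""
    if len(token_nll_nats) != len(token_byte_lengths):
        raise ValueError("need one byte length per token")
    total_bytes = sum(token_byte_lengths)
    if total_bytes <= 0:
        raise ValueError("text must contain at least one byte")
    total_bits = sum(token_nll_nats) / math.log(2)  # nats -> bits
    return total_bits / total_bytes


LN2 = math.log(2)

# The same 11-byte text under two hypothetical tokenizers.
# Tokenizer A: two large tokens (5 and 6 bytes), 4 and 7 bits of loss.
bpb_a = bits_per_byte([4 * LN2, 7 * LN2], [5, 6])
# Tokenizer B: eleven single-byte tokens, 1 bit of loss each.
bpb_b = bits_per_byte([LN2] * 11, [1] * 11)

print(bpb_a, bpb_b)
```

The average per-token loss differs by a factor of 5.5 between the two tokenizations, yet both normalize to the same loss per byte, which is what makes the metric usable across changes to the tokenizer or vocabulary.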

    To get started, clone the repository, install dependencies via uv, run prepare.py once to download data and train a BPE tokenizer, then spin up your AI coding agent pointed at program.md.
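The setup steps above can be scripted. The sketch below only builds the command list and runs nothing unless `run_setup` is called. It assumes `uv sync` is how dependencies are installed and `uv run prepare.py` is how the preparation script is launched; the listing says only "install dependencies via uv" and "run prepare.py", so check the project README. The repository URL and directory are placeholders supplied by the caller.

```python
import subprocess
from typing import List, Tuple

Command = Tuple[List[str], str]  # (argv, working directory)


def setup_commands(repo_url: str, repo_dir: str) -> List[Command]:
    """Return the one-time setup steps as (argv, cwd) pairs."""
    return [
        (["git", "clone", repo_url, repo_dir], "."),
        (["uv", "sync"], repo_dir),               # assumed dependency install step
        (["uv", "run", "prepare.py"], repo_dir),  # downloads data, trains the BPE tokenizer
    ]


def run_setup(repo_url: str, repo_dir: str) -> None:
    """Execute each step in order, stopping at the first failure."""
    for argv, cwd in setup_commands(repo_url, repo_dir):
        subprocess.run(argv, cwd=cwd, check=True)


# Dry run: print the commands instead of executing them.
for argv, cwd in setup_commands("<repository-url>", "autoresearch"):
    print(f"(in {cwd}) {' '.join(argv)}")
```

After these steps, the remaining work is manual: start the coding agent and point it at program.md.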

    Pricing

    Open Source

    Fully open source under MIT license. No cost to use, modify, or distribute.

    • Full source code access
    • Autonomous AI research loop
    • MIT license

    Capabilities

    Key Features

    • Autonomous AI agent loop: modify → train → evaluate → keep or discard
    • Fixed 5-minute wall-clock training budget per experiment
    • Validation bits-per-byte (val_bpb) as the single comparable metric
    • Agent-editable train.py with full GPT model, Muon+AdamW optimizer, and training loop
    • Human-editable program.md for setting agent research strategy
    • ~100 experiments possible in a single overnight run
    • Single NVIDIA GPU support (tested on H100)
    • MIT license — fully open source
    • Built on nanochat, a minimal GPT training codebase
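A fixed wall-clock budget means the loop stops on a deadline rather than after a set number of steps, so a slower architecture simply completes fewer steps. The sketch below illustrates the idea along with the throughput arithmetic behind the "~100 experiments" figure. `train_with_time_budget`, `experiments_in`, and the 8-hour night are assumptions for illustration, not the project's code.

```python
import time
from typing import Callable, Tuple


def train_with_time_budget(
    step_fn: Callable[[int], float],
    budget_seconds: float,
    clock: Callable[[], float] = time.monotonic,
) -> Tuple[int, float]:
    """Call step_fn until the wall-clock budget is spent.

    Returns (steps completed, loss from the last step).
    """
    deadline = clock() + budget_seconds
    steps = 0
    last_loss = float("nan")
    while clock() < deadline:
        last_loss = step_fn(steps)
        steps += 1
    return steps, last_loss


def experiments_in(hours: float, minutes_per_experiment: float = 5.0) -> int:
    """Upper bound on experiment count, ignoring startup and evaluation overhead."""
    return int(hours * 60 // minutes_per_experiment)


def toy_step(step: int) -> float:
    """Stand-in for one optimizer step; returns a fake decreasing loss."""
    return 1.0 / (step + 1)


steps_done, final_loss = train_with_time_budget(toy_step, budget_seconds=0.05)
print(f"{steps_done} steps in 0.05 s, final loss {final_loss:.6f}")
print(experiments_in(1), "per hour;", experiments_in(8), "in an 8-hour night")
```

At one experiment per 5 minutes, an hour holds 12 experiments and an assumed 8-hour night holds 96, which is consistent with the rounded figure of about 100.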

    Integrations

    Claude (Anthropic)
    OpenAI Codex
    uv package manager
    PyTorch
    nanochat

    Developer

    Andrej Karpathy

    Andrej Karpathy publishes open-source machine learning demos and educational projects focused on deep learning and practical implementation. He builds compact, example-driven repositories that help developers learn by reading and running working code. His work emphasizes clarity and hands-on experimentation with modern ML models.

    Founded 2024
    San Francisco, CA
    10 employees

    Used by

    Open-source AI community
    Independent learners
    2 tools in directory

    Similar Tools

    Sciloop

    End-to-end AI scientist that automates ML research workflows from ideation to experimentation and paper drafting.

    Devin

    Devin is an AI software engineer that autonomously plans, codes, tests, and ships software, handling complex engineering tasks from code migrations to bug fixes and PR reviews.

    Devin Desktop

    Devin Desktop is an AI software engineer by Cognition that autonomously plans and executes complex engineering tasks including code migrations, bug fixes, PR reviews, and more.

    Related Topics

    Autonomous Systems

    AI agents that can perform complex tasks with minimal human guidance.

    260 tools

    Agent Frameworks

    Tools and platforms for building and deploying custom AI agents.

    381 tools

    AI Infrastructure

    Infrastructure designed for deploying and running AI models.

    263 tools