    EveryDev.ai
    nanochat

    AI Development Libraries

    End-to-end, open-source recipe for training and serving a small chat LLM (~560M params) for about $100 on one 8×H100 node. It includes a tokenizer, a pretrain → mid-train → SFT → optional RL pipeline, a KV-cached inference engine, and a FastAPI web UI.

    At a Glance

    Pricing
    Open Source

    Open-source repository available for local use, modification, and learning.

    Available On

    Web
    API
    SDK

    Topics

    AI Development Libraries, Local Inference, Conversational Agents

    Alternatives

    flash-moe, IBM Granite Playground, Modular
    Developer
    Andrej Karpathy, San Francisco, CA, Est. 2024

    Updated Feb 2026

    About nanochat

    nanochat is an open-source, from-scratch codebase for training and serving your own small chat LLM on a tight budget. It is designed to run a full “speedrun” on a single 8×H100 node in a few hours (about $100), covering tokenization, base pretraining, mid-training on chat data, supervised finetuning, optional RL on GSM8K, evaluation, and a simple web UI for talking to the model.
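As a rough check on the budget figure, the cost of a run is simply the wall-clock time multiplied by the node's rental rate. The sketch below is illustrative only: the 4-hour duration and the $24 per hour rate for one 8×H100 node are assumptions, not figures stated in this listing, and `run_cost_usd` is a hypothetical helper.

```python
def run_cost_usd(hours: float, node_rate_per_hour: float) -> float:
    """Cost of renting a single node for the whole run."""
    return hours * node_rate_per_hour


# Assumed inputs: 4 hours of wall-clock time at $24/hour for one 8xH100 node.
print(run_cost_usd(4, 24.0))  # 96.0
```

Actual prices vary by cloud provider, so the result should be read as an order-of-magnitude estimate.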

    What it includes:

    • Tokenizer & data: a custom Rust BPE tokenizer and scripts to pull a shuffled subset of FineWeb-EDU for pretraining.
    • Training stages: base pretraining → mid-training (SmolTalk + MMLU aux + GSM8K) → SFT; optional RL (simplified GRPO) on GSM8K.
    • Evaluation: CORE / ChatCORE metrics plus task-specific scores (ARC-Easy/Challenge, MMLU, GSM8K, HumanEval), and an auto-generated report.md summarizing runs.
    • Inference & serving: a compact engine with KV caching (prefill + decode) and a FastAPI server with a lightweight chat web UI.
    • Scalability knob: model depth as the primary “slider” (e.g., d20 ≈ 560M params), with auto-adjusted batch/accumulation.
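The prefill + decode split behind KV caching can be illustrated with a toy, single-head attention in pure Python. This is a minimal sketch of the general technique, not nanochat's engine: `KVCache`, `prefill`, `decode_step`, and `attend` are hypothetical names, and the toy model uses each input vector directly as its own query, key, and value.

```python
import math


def attend(query, keys, values):
    """Single-head scaled dot-product attention for one query vector."""
    scale = 1.0 / math.sqrt(len(query))
    scores = [scale * sum(q * k for q, k in zip(query, key)) for key in keys]
    peak = max(scores)  # subtract the max for a numerically stable softmax
    weights = [math.exp(s - peak) for s in scores]
    total = sum(weights)
    dim = len(values[0])
    return [sum(w * v[i] for w, v in zip(weights, values)) / total
            for i in range(dim)]


class KVCache:
    """Stores the key and value vectors of every token processed so far."""

    def __init__(self):
        self.keys = []
        self.values = []

    def append(self, key, value):
        self.keys.append(key)
        self.values.append(value)


def prefill(cache, prompt_vectors):
    """Process the whole prompt once, filling the cache; return the last output."""
    out = None
    for x in prompt_vectors:
        cache.append(x, x)  # toy assumption: key = value = input vector
        out = attend(x, cache.keys, cache.values)
    return out


def decode_step(cache, x):
    """Process one new token, reusing cached keys and values."""
    cache.append(x, x)
    return attend(x, cache.keys, cache.values)


cache = KVCache()
prompt = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
prefill(cache, prompt)
next_output = decode_step(cache, [0.5, -0.5])
print(len(cache.keys), next_output)
```

Each decode step only computes attention for the newest token, because the keys and values of earlier tokens are read from the cache instead of being recomputed.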

    Use it to understand the full training loop, tweak data or hyperparameters, and stand up a private, hackable chat model end-to-end.
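To see how a single depth setting could translate into a parameter count, here is a back-of-the-envelope estimate for a standard decoder-only transformer. Every constant is an assumption made for illustration and is not confirmed by this listing: width equal to 64 × depth, a 65,536-token vocabulary, a 4× MLP expansion, no biases, and untied input and output embeddings. `estimate_params` is a hypothetical helper.

```python
def estimate_params(depth: int, vocab_size: int = 65536,
                    width_per_layer: int = 64) -> int:
    """Rough parameter count for a decoder-only transformer (assumed shapes)."""
    dim = depth * width_per_layer          # assumed: width grows with depth
    attention = 4 * dim * dim              # Q, K, V, and output projections
    mlp = 8 * dim * dim                    # two matrices with 4x hidden expansion
    blocks = depth * (attention + mlp)
    embeddings = 2 * vocab_size * dim      # assumed untied embedding and output head
    return blocks + embeddings


print(f"{estimate_params(20) / 1e6:.0f}M")  # 561M
```

Under these assumptions a depth of 20 lands close to the ~560M figure quoted for d20; the real codebase may size its layers differently.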


    Pricing

    • Full repository source code
    • Permissive open-source usage for experimentation
    • Reference implementation for an end-to-end chat LLM pipeline

    Capabilities

    Key Features

    • End-to-end LLM training pipeline (tokenizer → pretrain → mid-train → SFT → optional RL)
    • Custom Rust BPE tokenizer and data helpers
    • Evaluation scripts (CORE/ChatCORE, ARC, MMLU, GSM8K, HumanEval) with auto-generated report
    • KV-cached inference engine and FastAPI web UI for chat
    • Single-node speedrun scripts for one 8×H100 box; depth-based scaling knob
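For readers unfamiliar with byte-pair encoding, the idea behind a BPE tokenizer can be sketched in a few lines of Python. This is a generic, unoptimized illustration of BPE training on raw bytes, not the project's Rust implementation; `train_bpe` and `merge` are hypothetical names.

```python
from collections import Counter


def merge(ids, pair, new_id):
    """Replace every non-overlapping occurrence of `pair` in `ids` with `new_id`."""
    out, i = [], 0
    while i < len(ids):
        if i + 1 < len(ids) and (ids[i], ids[i + 1]) == pair:
            out.append(new_id)
            i += 2
        else:
            out.append(ids[i])
            i += 1
    return out


def train_bpe(text: str, num_merges: int):
    """Learn up to `num_merges` merges; return the merge table and encoded ids."""
    ids = list(text.encode("utf-8"))  # start from raw bytes (token ids 0..255)
    merges = {}
    next_id = 256
    for _ in range(num_merges):
        pairs = Counter(zip(ids, ids[1:]))
        if not pairs:
            break
        pair, count = pairs.most_common(1)[0]
        if count < 2:  # nothing repeats, so further merges gain nothing
            break
        merges[pair] = next_id
        ids = merge(ids, pair, next_id)
        next_id += 1
    return merges, ids


merges, ids = train_bpe("low lower lowest", 3)
print(len(merges), len(ids))
```

Each merge adds one token to the vocabulary and shortens the encoded sequence, which is the trade-off a BPE vocabulary size controls.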

    Integrations

    GitHub

    Developer

    Andrej Karpathy

    Andrej Karpathy publishes open-source machine learning demos and educational projects focused on deep learning and practical implementation. He builds compact, example-driven repositories that help developers learn by reading and running working code. His work emphasizes clarity and hands-on experimentation with modern ML models.

    Founded 2024
    San Francisco, CA
    10 employees

    Used by

    Open-source AI community
    Independent learners

    Similar Tools

    flash-moe

    A Mixture of Experts (MoE) implementation in Python, enabling efficient sparse model inference by routing inputs to specialized expert sub-networks.

    IBM Granite Playground

    Interactive playground for testing and experimenting with IBM's Granite family of open-source AI foundation models.

    Modular

    AI infrastructure platform with MAX framework, Mojo language, and Mammoth for GPU-portable GenAI serving across NVIDIA and AMD hardware.

    Related Topics

    AI Development Libraries

    Programming libraries and frameworks that provide machine learning capabilities, model integration, and AI functionality for developers.

    134 tools

    Local Inference

    Tools and platforms for running AI inference locally without cloud dependence.

    71 tools

    Conversational Agents

    AI chatbots and virtual assistants that can engage in natural dialogue.

    194 tools