# nanochat

> End-to-end, open-source recipe to train and serve a small chat LLM (~560M params) for about $100 on one 8×H100 node, with tokenizer, pretrain→midtrain→SFT→optional RL, FastAPI web UI, and a KV-cached inference engine.

nanochat is an open-source, from-scratch codebase for training and serving your own small chat LLM on a tight budget. It is designed to run a full “speedrun” on a single 8×H100 node in a few hours (about $100), covering tokenization, base pretraining, mid-training on chat data, supervised finetuning, optional RL on GSM8K, evaluation, and a simple web UI for talking to the model.

What it includes:
- **Tokenizer & data**: a custom Rust BPE tokenizer and scripts to pull a shuffled subset of FineWeb-EDU for pretraining.
- **Training stages**: base pretraining → mid-training (SmolTalk, MMLU auxiliary data, and GSM8K) → SFT; optional RL (simplified GRPO) on GSM8K.
- **Evaluation**: CORE / ChatCORE metrics plus task-specific scores (ARC-Easy/Challenge, MMLU, GSM8K, HumanEval), and an auto-generated `report.md` summarizing runs.
- **Inference & serving**: a compact engine with KV caching (prefill + decode) and a FastAPI server with a lightweight chat web UI.
- **Scalability knob**: model depth as the primary “slider” (e.g., d20 ≈ 560M params), with batch size and gradient accumulation adjusted automatically.
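
To make the depth “slider” concrete, here is a minimal sketch of how a single depth value could determine model width, parameter count, and gradient accumulation. The constants (64 channels per layer of depth, 128-dimensional heads, a 65,536-token vocabulary, untied input/output embeddings, a 4× MLP) and all function names are assumptions chosen for illustration; they are not stated on this page. Under those assumptions, depth 20 comes out at about 561M parameters, in line with the figure quoted above.

```python
def model_shape(depth, dim_per_layer=64, head_dim=128):
    """Derive width and head count from depth (assumed rule, for illustration)."""
    model_dim = depth * dim_per_layer
    num_heads = max(1, -(-model_dim // head_dim))  # ceiling division
    return model_dim, num_heads


def estimate_params(depth, vocab_size=65536):
    """Rough parameter count for a bias-free Transformer with untied embeddings."""
    model_dim, _ = model_shape(depth)
    # Attention (Q, K, V, output) = 4*d^2; MLP with 4x expansion = 8*d^2.
    per_layer = 12 * model_dim ** 2
    # Separate input embedding and output projection (assumed untied).
    embeddings = 2 * vocab_size * model_dim
    return depth * per_layer + embeddings


def grad_accum_steps(total_batch_tokens, device_batch_size, seq_len, num_gpus):
    """Micro-steps needed so every optimizer step sees the same token budget."""
    tokens_per_micro_step = device_batch_size * seq_len * num_gpus
    if total_batch_tokens % tokens_per_micro_step != 0:
        raise ValueError("total batch must be a multiple of the per-step tokens")
    return total_batch_tokens // tokens_per_micro_step


if __name__ == "__main__":
    print(model_shape(20))        # width and heads for a depth-20 model
    print(estimate_params(20))    # approximate parameter count
    # Halving the per-device batch doubles the accumulation steps.
    print(grad_accum_steps(524288, 32, 2048, 8))
    print(grad_accum_steps(524288, 16, 2048, 8))
```

The point of the accumulation helper is that shrinking the per-device batch (for example, to fit a larger model in memory) leaves the effective batch size unchanged; only the number of micro-steps per optimizer step grows.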

Use it to understand the full training loop, tweak data or hyperparameters, and stand up a private, hackable chat model end-to-end.
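
For readers starting with the tokenizer, the core of byte-pair encoding (BPE) training fits in a few lines. The sketch below is a generic byte-level BPE loop written in Python for readability; it is not the project's Rust implementation, and the function names are hypothetical. Production tokenizers typically also pre-split text with a regex before counting pairs, which is omitted here.

```python
from collections import Counter


def most_frequent_pair(ids):
    """Return the most common adjacent pair of token ids, or None if too short."""
    pairs = Counter(zip(ids, ids[1:]))
    return pairs.most_common(1)[0][0] if pairs else None


def merge(ids, pair, new_id):
    """Replace every non-overlapping occurrence of `pair` with `new_id`."""
    out, i = [], 0
    while i < len(ids):
        if i + 1 < len(ids) and (ids[i], ids[i + 1]) == pair:
            out.append(new_id)
            i += 2
        else:
            out.append(ids[i])
            i += 1
    return out


def train_bpe(text, num_merges):
    """Learn up to `num_merges` merges, starting from raw UTF-8 bytes (ids 0..255)."""
    ids = list(text.encode("utf-8"))
    merges = {}
    for step in range(num_merges):
        pair = most_frequent_pair(ids)
        if pair is None:
            break
        new_id = 256 + step  # new tokens are numbered after the 256 byte values
        ids = merge(ids, pair, new_id)
        merges[pair] = new_id
    return ids, merges


if __name__ == "__main__":
    ids, merges = train_bpe("aaabdaaabac", num_merges=1)
    print(ids, merges)
```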

## Features
- End-to-end LLM training pipeline (tokenizer → pretrain → mid-train → SFT → optional RL)
- Custom Rust BPE tokenizer and data helpers
- Evaluation scripts (CORE/ChatCORE, ARC, MMLU, GSM8K, HumanEval) with auto-generated report
- KV-cached inference engine and FastAPI web UI for chat
- Single-node speedrun scripts for one 8×H100 box; depth-based scaling knob
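
The KV-cached inference listed above follows a general pattern: a prefill phase stores keys and values for the prompt, and each decode step then appends one new key/value pair instead of recomputing the whole sequence. The following is a toy, single-head illustration in plain Python, not the project's engine. The `project` function is a hypothetical stand-in for learned projections, and a real prefill would process the prompt in one batched pass rather than a loop.

```python
import math


def attend(q, keys, values):
    """Softmax attention of one query vector over all stored keys and values."""
    scale = 1.0 / math.sqrt(len(q))
    scores = [scale * sum(a * b for a, b in zip(q, k)) for k in keys]
    top = max(scores)  # subtract the max for numerical stability
    weights = [math.exp(s - top) for s in scores]
    total = sum(weights)
    dim = len(values[0])
    return [sum(w * v[i] for w, v in zip(weights, values)) / total for i in range(dim)]


def project(x):
    """Hypothetical stand-in for the learned query/key/value projections."""
    return [2.0 * t for t in x], [t + 1.0 for t in x], [-t for t in x]


class KVCache:
    """Keys and values of every position processed so far."""

    def __init__(self):
        self.keys, self.values = [], []

    def append(self, k, v):
        self.keys.append(k)
        self.values.append(v)


def decode_step(x, cache):
    """Process one new position: project it, extend the cache, attend over it."""
    q, k, v = project(x)
    cache.append(k, v)
    return attend(q, cache.keys, cache.values)


def prefill(prompt, cache):
    """Fill the cache from the prompt and return the last position's output."""
    out = None
    for x in prompt:
        out = decode_step(x, cache)
    return out


def full_recompute(tokens):
    """Cache-free baseline: re-project every position for the final query."""
    projected = [project(x) for x in tokens]
    keys = [k for _, k, _ in projected]
    values = [v for _, _, v in projected]
    return attend(projected[-1][0], keys, values)


if __name__ == "__main__":
    cache = KVCache()
    prompt = [[0.1, 0.2], [0.3, -0.1], [0.0, 0.5]]
    prefill(prompt, cache)
    new_token = [0.2, 0.2]
    cached = decode_step(new_token, cache)
    baseline = full_recompute(prompt + [new_token])
    print(all(abs(a - b) < 1e-9 for a, b in zip(cached, baseline)))
```

The cached path and the full recomputation give the same output for the newest position; the cache only removes the repeated key/value projections for earlier positions, which is where the decode-time savings come from.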

## Integrations
GitHub

## Platforms
Web, API, Developer SDK

## Pricing
Open Source

## Links
- Website: https://github.com/karpathy/nanochat
- Documentation: https://github.com/karpathy/nanochat/discussions
- Repository: https://github.com/karpathy/nanochat
- EveryDev.ai: https://www.everydev.ai/tools/nanochat
