SmallCode
A terminal-native AI coding agent optimized for small local LLMs (8B-35B parameters), with intelligent context management, forgiving tool parsing, and patch-first editing.
At a Glance
Fully free and open-source under the MIT License. Install via npm or prebuilt binaries.
Listed May 2026
About SmallCode
SmallCode is an open-source, terminal-native coding agent built specifically to extract useful work from local language models in the 8B–35B parameter range running on consumer hardware. It was created by GitHub user Doorman11991, published under the MIT License, and has accumulated over 1,600 stars since its initial release in May 2026. The project's core premise is that frontier-model tools like OpenCode assume capabilities — massive context windows, reliable JSON tool calling, and high reasoning depth — that small local models simply don't have.
What It Is
SmallCode is a CLI-based coding agent that compensates for the architectural limitations of small LLMs through a layered set of intelligent subsystems. Rather than dumping full file contents into context or assuming perfect tool-call JSON output, it manages context budgets, parses messy model output in multiple formats, decomposes complex tasks into atomic TODO steps, and uses search-and-replace patching instead of full-file rewrites. It targets developers who want a fully local, privacy-preserving coding assistant without requiring cloud API calls or high-end hardware.
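To make the search-and-replace idea concrete, here is a minimal sketch of such an edit primitive. The names applyPatch, countOccurrences, and PatchResult are hypothetical and chosen for illustration; this is not SmallCode's actual implementation. The sketch assumes an edit should only land when the search text matches exactly once, so a stale or ambiguous patch is rejected instead of corrupting the file.

```typescript
// Hypothetical shapes for illustration only; not SmallCode's real API.
interface PatchResult {
  ok: boolean;
  content: string; // patched text on success, the unchanged input on failure
  reason: string;  // empty on success
}

// Count non-overlapping occurrences of `needle` in `haystack`.
function countOccurrences(haystack: string, needle: string): number {
  if (needle.length === 0) return 0;
  let count = 0;
  let index = haystack.indexOf(needle);
  while (index !== -1) {
    count += 1;
    index = haystack.indexOf(needle, index + needle.length);
  }
  return count;
}

// Apply one search-and-replace edit. Assumption: the edit is rejected unless
// the search text matches exactly once in the file.
function applyPatch(content: string, search: string, replace: string): PatchResult {
  const matches = countOccurrences(content, search);
  if (matches === 0) {
    return { ok: false, content, reason: "search text not found" };
  }
  if (matches > 1) {
    return { ok: false, content, reason: `search text is ambiguous (${matches} matches)` };
  }
  const at = content.indexOf(search);
  const patched = content.slice(0, at) + replace + content.slice(at + search.length);
  return { ok: true, content: patched, reason: "" };
}

// Usage: the model only has to reproduce the line it wants to change.
const original = "function add(a, b) {\n  return a - b;\n}\n";
const patchedFile = applyPatch(original, "return a - b;", "return a + b;");
```

The model emits only the changed fragment, so the cost of an edit scales with the size of the change instead of the size of the file.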
Architecture and Key Subsystems
SmallCode's modular architecture spans a fullscreen TUI, a programmatic API, and a rich set of agent-loop components:
- Context Budget Engine — caps tool results at 4k characters, performs mid-turn eviction, and uses semantic compression to summarize history rather than drop it
- 2-Stage Tool Routing — the model first picks a category (read/write/search/run/plan), then receives only the relevant tool schemas, halving schema context overhead
- Forgiving Tool Call Parser — accepts tool calls in JSON, YAML, XML, Hermes format, Liquid AI markers, or plain text, and auto-repairs common mistakes like wrong parameter names or type mismatches
- Patch-First Editing — uses search-and-replace as the primary edit primitive, since small models frequently truncate or hallucinate when asked to reproduce entire files
- TODO-Driven Planning — decomposes complex tasks into atomic steps stored in a TODO file, with lint/compile validation before each step advances
- MarrowScript Cognition Layer — a declarative prompt-compilation system where a 50-line .marrow file generates over 1,400 lines of TypeScript with caching, retry, validation, traces, and token budget enforcement
- Working Memory — a persistent scratchpad that survives across turns, compensating for limited reasoning depth in small models
- Persistent Shell Sessions — bash calls share a long-lived shell process so cd, environment variables, and shell state persist across tool calls
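As an illustration of what forgiving tool-call parsing can look like, the sketch below handles one slice of the problem: a JSON tool call buried in surrounding prose or wrapper tags, with inconsistent key names. Everything here is an assumption made for the example, including the function names, the accepted key variants (name/tool/function, arguments/args/parameters), and the alias table; SmallCode's real parser and the other formats it accepts are not reproduced.

```typescript
// Hypothetical shapes for illustration only; not SmallCode's real parser.
interface ToolCall {
  name: string;
  args: Record<string, unknown>;
}

// Assumed alias table: common wrong parameter names mapped to canonical ones.
const PARAM_ALIASES = new Map<string, string>([
  ["file", "path"],
  ["filename", "path"],
  ["file_path", "path"],
  ["cmd", "command"],
]);

// Return the first balanced {...} span, ignoring braces inside JSON strings.
function extractFirstJsonObject(text: string): string | null {
  const start = text.indexOf("{");
  if (start === -1) return null;
  let depth = 0;
  let inString = false;
  let escaped = false;
  for (let i = start; i < text.length; i++) {
    const ch = text[i];
    if (inString) {
      if (escaped) escaped = false;
      else if (ch === "\\") escaped = true;
      else if (ch === '"') inString = false;
      continue;
    }
    if (ch === '"') inString = true;
    else if (ch === "{") depth++;
    else if (ch === "}") {
      depth--;
      if (depth === 0) return text.slice(start, i + 1);
    }
  }
  return null; // unbalanced braces, for example truncated output
}

// Parse a tool call out of noisy model output, or return null if none is usable.
function parseToolCall(raw: string): ToolCall | null {
  const json = extractFirstJsonObject(raw);
  if (json === null) return null;
  let parsed: unknown;
  try {
    parsed = JSON.parse(json);
  } catch {
    return null;
  }
  if (typeof parsed !== "object" || parsed === null) return null;
  const obj = parsed as Record<string, unknown>;

  // Accept several key spellings for the tool name.
  const name = obj.name ?? obj.tool ?? obj.function;
  if (typeof name !== "string" || name.trim() === "") return null;

  // Arguments may be an object or a JSON-encoded string.
  let rawArgs: unknown = obj.arguments ?? obj.args ?? obj.parameters ?? {};
  if (typeof rawArgs === "string") {
    try {
      rawArgs = JSON.parse(rawArgs);
    } catch {
      return null;
    }
  }
  if (typeof rawArgs !== "object" || rawArgs === null || Array.isArray(rawArgs)) return null;

  // Repair parameter names using the alias table.
  const args: Record<string, unknown> = {};
  for (const [key, value] of Object.entries(rawArgs as Record<string, unknown>)) {
    args[PARAM_ALIASES.get(key) ?? key] = value;
  }
  return { name: name.trim(), args };
}
```

A production parser would add further stages for the non-JSON formats and for type coercion; the point of the sketch is that each repair is a small, testable step, and anything unrecoverable returns null so the agent loop can retry.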
Reliability Guards and Quality Features
The agent ships with a dense set of failure-mode mitigations specifically tuned for small model behavior:
- Early-Stop Detection — catches repetition loops, patch spirals on corrupted files, and greeting regression (model losing task context)
- Quality Monitor — detects empty turns, blank tool names, hallucinated tool names, and exact-repeat tool calls, injecting steering corrections capped at 2 consecutive interventions
- Read-Before-Write Guard — refuses the first write to an existing unread file, prompting the model to read first
- Tool-Call Deduplication — short-circuits identical read-only tool calls within a sliding window using cached results
- Adaptive Retry Temperature — varies temperature across retry attempts so the model doesn't produce the same broken output repeatedly
- Per-Tool Trust Score Decay — soft-demotes tools that fail 3+ times and drops them from the schema after 5+ failures in a session
- Snapshot & Auto-Rollback — checkpoints file state before each turn and can automatically revert all edits if validation hard-fails
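The per-tool trust thresholds above can be sketched as a small failure counter. This is a hypothetical illustration, not SmallCode's code: the names are invented, "soft-demote" is assumed to mean listing the tool after healthier ones, and recovery after a successful call is not modelled.

```typescript
// Hypothetical tracker for illustration. The thresholds follow the listing:
// 3+ failures demote a tool, 5+ failures drop it from the schema.
type ToolStatus = "active" | "demoted" | "dropped";

const DEMOTE_AT = 3;
const DROP_AT = 5;

function createTrustTracker() {
  // Failure counts for the current session, keyed by tool name.
  const failures = new Map<string, number>();

  function recordFailure(tool: string): void {
    failures.set(tool, (failures.get(tool) ?? 0) + 1);
  }

  function status(tool: string): ToolStatus {
    const count = failures.get(tool) ?? 0;
    if (count >= DROP_AT) return "dropped";
    if (count >= DEMOTE_AT) return "demoted";
    return "active";
  }

  // Tools to offer the model: dropped tools are removed, and demoted tools
  // are listed after active ones (an assumed reading of "soft-demote").
  function visibleTools(allTools: string[]): string[] {
    const active = allTools.filter((t) => status(t) === "active");
    const demoted = allTools.filter((t) => status(t) === "demoted");
    return [...active, ...demoted];
  }

  return { recordFailure, status, visibleTools };
}
```

Keeping the counts per session means a tool that misbehaves in one project state gets a clean slate the next time the agent starts.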
Deployment Model and Setup
SmallCode installs globally via npm (npm install -g smallcode) or runs directly with npx smallcode. Prebuilt binaries for Windows, macOS, and Linux bundle Node.js and all native addons, eliminating the need for node-gyp or C++ build tools. Configuration is handled via a .env file or smallcode.toml, pointing to any OpenAI-compatible local endpoint such as LM Studio, Ollama, or llama.cpp. Optional cloud escalation to Claude, OpenAI, or DeepSeek is fully opt-in and requires an API key. The RAG indexer requires Python 3 and Git. The tool also exposes a programmatic API (require('smallcode')) for use in CI pipelines or custom tooling.
Update: v1.5.2
The latest release is v1.5.2, published on May 30, 2026, the same day as the last repository push. Recent additions visible in the README include the Adaptive Model Router (tracks per-model failure rates and auto-routes to medium or strong fallback models), the Contract/Definition-of-Done system (declarative testable assertions the agent must satisfy before reporting completion), the Evidence Store (captures what was tried and what worked across sessions), Plan-Then-Execute mode, Semantic Merge recovery for failed patches, and the Benchmark Harness with a diff tool for CI-integrated regression detection. Taken together, these additions point toward making small-model agents more reliable and measurable rather than simply more capable.
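One way to picture failure-rate-based routing is the sketch below. It is a hypothetical illustration under stated assumptions, not the project's router: the tier names, the 50% failure-rate threshold, the minimum of four attempts before a model can be flagged, and the model identifiers in the usage lines are all invented for the example.

```typescript
// Hypothetical router for illustration only; not SmallCode's implementation.
type Tier = "small" | "medium" | "strong";
const TIER_ORDER: Tier[] = ["small", "medium", "strong"];

interface ModelStats {
  attempts: number;
  failures: number;
}

function createModelRouter(
  models: Record<Tier, string>,
  maxFailureRate = 0.5, // assumed threshold
  minAttempts = 4       // assumed minimum sample before a model can be flagged
) {
  const stats = new Map<string, ModelStats>();

  // Record the outcome of one attempt by the given model.
  function record(model: string, succeeded: boolean): void {
    const s = stats.get(model) ?? { attempts: 0, failures: 0 };
    s.attempts += 1;
    if (!succeeded) s.failures += 1;
    stats.set(model, s);
  }

  // A model is flagged once it has enough attempts and fails too often.
  function isUnreliable(model: string): boolean {
    const s = stats.get(model);
    if (s === undefined || s.attempts < minAttempts) return false;
    return s.failures / s.attempts > maxFailureRate;
  }

  // Prefer the weakest tier that is not flagged; otherwise use the strongest.
  function pick(): string {
    for (const tier of TIER_ORDER) {
      const model = models[tier];
      if (!isUnreliable(model)) return model;
    }
    return models.strong;
  }

  return { record, pick };
}

// Usage with made-up model identifiers.
const router = createModelRouter({ small: "local-8b", medium: "local-32b", strong: "cloud-fallback" });
const firstChoice = router.pick();
```

Requiring a minimum number of attempts keeps a single early failure from pushing every task to a larger model.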
Pricing
Open Source
- Full terminal-native coding agent
- All 35+ features and subsystems
- Prebuilt binaries for Windows, macOS, Linux
- npm global install or npx
- Programmatic API
Capabilities
Key Features
- Terminal-native fullscreen TUI and classic readline fallback
- Context budget engine with mid-turn eviction and semantic compression
- 2-stage tool routing to halve schema context overhead
- Forgiving multi-format tool call parser (JSON, YAML, XML, Hermes, plain text)
- Patch-first search-and-replace editing primitive
- TODO-driven task decomposition with per-step validation
- MarrowScript cognition layer with prompt caching and token budget enforcement
- Working memory persistent scratchpad across turns
- Persistent shell sessions with optional cwd containment
- Early-stop detection for repetition loops and patch spirals
- Quality monitor for structural failure modes
- Read-before-write guard
- Tool-call deduplication with cached results
- Adaptive retry temperature across failure attempts
- Per-tool trust score decay and schema demotion
- Snapshot and auto-rollback on validation failure
- Opt-in model escalation to Claude, OpenAI, or DeepSeek on hard failure
- RAG harness with local GitHub corpus indexing
- Benchmark harness with smoke, polyglot, and tool-use suites
- Benchmark diff tool for CI regression detection
- Contract/Definition-of-Done system with testable assertions
- Evidence store for cross-session learning
- Plan-then-execute mode for multi-step tasks
- Semantic merge recovery for failed patches
- Adaptive model router based on per-model failure rates
- Bootstrap detection for project type and test runner
- Knowledge injection from local reference notes
- Plugin system with lifecycle hooks and provider registry
- Skill system with bundled dev-methodology skills
- Programmatic API for CI and custom tooling
- Web browsing via Playwright with stealth mode (opt-in)
- BoneScript integration for Node.js/TypeScript backend scaffolding
- Thinking budget control for reasoning models
- Interactive provider wizard (/provider)
- Observability: token monitor, context budget bar, execution traces
