SmallCode

Name: SmallCode
Availability: OnlineOnly
Author: Doorman11991

A terminal-native AI coding agent optimized for small local LLMs (8B-35B parameters), with intelligent context management, forgiving tool parsing, and patch-first editing.

At a Glance

Pricing

Open Source

Fully free and open-source under the MIT License. Install via npm or prebuilt binaries.

Available On

CLI

Windows

macOS

Linux

API

Doorman11991 builds SmallCode, an open-source terminal-nativ…

Listed May 2026

About SmallCode

SmallCode is an open-source, terminal-native coding agent built to extract useful work from local language models in the 8B–35B parameter range running on consumer hardware. It was created by GitHub user Doorman11991, is published under the MIT License, and has accumulated over 1,600 stars since its initial release in May 2026. The project's core premise is that tools built for frontier models, such as OpenCode, assume capabilities that small local models lack: large context windows, reliable JSON tool calling, and deep reasoning.

What It Is

SmallCode is a CLI-based coding agent that compensates for the architectural limitations of small LLMs through a layered set of intelligent subsystems. Rather than dumping full file contents into context or assuming perfect tool-call JSON output, it manages context budgets, parses messy model output in multiple formats, decomposes complex tasks into atomic TODO steps, and uses search-and-replace patching instead of full-file rewrites. It targets developers who want a fully local, privacy-preserving coding assistant without requiring cloud API calls or high-end hardware.
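To make the search-and-replace idea concrete, here is a minimal sketch, assuming an edit is expressed as a search/replace pair that must match exactly once. The names `Patch`, `PatchResult`, and `applyPatch` are hypothetical and are not taken from SmallCode's code or API.

```typescript
// Hypothetical shape for a search-and-replace edit; not SmallCode's real types.
interface Patch {
  search: string; // exact text expected to exist in the file
  replace: string; // text to put in its place
}

type PatchResult =
  | { ok: true; content: string }
  | { ok: false; reason: "not_found" | "ambiguous" };

// Apply a patch only when the search text occurs exactly once, so an
// imprecise patch fails loudly instead of editing the wrong location.
function applyPatch(content: string, patch: Patch): PatchResult {
  if (patch.search.length === 0) return { ok: false, reason: "not_found" };
  const first = content.indexOf(patch.search);
  if (first === -1) return { ok: false, reason: "not_found" };
  const second = content.indexOf(patch.search, first + 1);
  if (second !== -1) return { ok: false, reason: "ambiguous" };
  const updated =
    content.slice(0, first) +
    patch.replace +
    content.slice(first + patch.search.length);
  return { ok: true, content: updated };
}
```

The model only has to reproduce the few lines it wants to change, and a failed match can be reported back to it as a structured reason rather than a corrupted file.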

Architecture and Key Subsystems

SmallCode's modular architecture spans a fullscreen TUI, a programmatic API, and a rich set of agent-loop components:

Context Budget Engine — caps tool results at 4k characters, performs mid-turn eviction, and uses semantic compression to summarize history rather than drop it
2-Stage Tool Routing — the model first picks a category (read/write/search/run/plan), then receives only the relevant tool schemas, halving schema context overhead
Forgiving Tool Call Parser — accepts tool calls in JSON, YAML, XML, Hermes format, Liquid AI markers, or plain text, and auto-repairs common mistakes like wrong parameter names or type mismatches
Patch-First Editing — uses search-and-replace as the primary edit primitive, since small models frequently truncate or hallucinate when asked to reproduce entire files
TODO-Driven Planning — decomposes complex tasks into atomic steps stored in a TODO file, with lint/compile validation before each step advances
MarrowScript Cognition Layer — a declarative prompt-compilation system where a 50-line .marrow file generates over 1,400 lines of TypeScript with caching, retry, validation, traces, and token budget enforcement
Working Memory — a persistent scratchpad that survives across turns, compensating for limited reasoning depth in small models
Persistent Shell Sessions — bash calls share a long-lived shell process so cd, environment variables, and shell state persist across tool calls
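As an illustration of the forgiving parser idea from the list above, the following sketch tries JSON first, then falls back to a plain-text call syntax, and repairs wrong parameter names along the way. It covers only two of the listed formats. All names here (`ToolCall`, `PARAM_ALIASES`, `parseToolCall`) and the alias table are assumptions for illustration, not SmallCode's implementation.

```typescript
// Hypothetical types and alias table, for illustration only.
interface ToolCall {
  name: string;
  args: Record<string, string>;
}

// Assumed examples of wrong parameter names mapped to the expected one.
const PARAM_ALIASES: Record<string, string> = {
  file: "path",
  filename: "path",
  file_path: "path",
};

function normalizeArgs(raw: Record<string, unknown>): Record<string, string> {
  const out: Record<string, string> = {};
  for (const [key, value] of Object.entries(raw)) {
    const fixedKey = PARAM_ALIASES[key] ?? key;
    // Coerce numbers and booleans to strings instead of rejecting the call.
    out[fixedKey] = typeof value === "string" ? value : JSON.stringify(value);
  }
  return out;
}

function tryJson(text: string): ToolCall | null {
  // Take the outermost {...} span so prose around the JSON is tolerated.
  const start = text.indexOf("{");
  const end = text.lastIndexOf("}");
  if (start === -1 || end <= start) return null;
  try {
    const parsed: unknown = JSON.parse(text.slice(start, end + 1));
    if (typeof parsed !== "object" || parsed === null) return null;
    const obj = parsed as Record<string, unknown>;
    const name = obj["name"] ?? obj["tool"];
    if (typeof name !== "string") return null;
    const args = obj["arguments"] ?? obj["args"] ?? {};
    if (typeof args !== "object" || args === null) return null;
    return { name, args: normalizeArgs(args as Record<string, unknown>) };
  } catch {
    return null;
  }
}

function tryPlainText(text: string): ToolCall | null {
  // Matches e.g. read_file(path=src/index.ts, limit=20)
  const match = /([A-Za-z_]\w*)\s*\(([^)]*)\)/.exec(text);
  if (!match) return null;
  const raw: Record<string, unknown> = {};
  for (const pair of (match[2] ?? "").split(",")) {
    const eq = pair.indexOf("=");
    if (eq === -1) continue;
    const value = pair.slice(eq + 1).trim().replace(/^["']|["']$/g, "");
    raw[pair.slice(0, eq).trim()] = value;
  }
  return { name: match[1] ?? "", args: normalizeArgs(raw) };
}

// Try the strictest format first, then fall back to looser ones.
function parseToolCall(text: string): ToolCall | null {
  return tryJson(text) ?? tryPlainText(text);
}
```

A fuller version would add further fallbacks (YAML, XML, Hermes, Liquid AI markers) to the same chain; the ordering from strict to loose is the design choice being illustrated.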

Reliability Guards and Quality Features

The agent ships with a dense set of failure-mode mitigations specifically tuned for small model behavior:

Early-Stop Detection — catches repetition loops, patch spirals on corrupted files, and greeting regression (model losing task context)
Quality Monitor — detects empty turns, blank tool names, hallucinated tool names, and exact-repeat tool calls, injecting steering corrections capped at 2 consecutive interventions
Read-Before-Write Guard — refuses the first write to an existing unread file, prompting the model to read first
Tool-Call Deduplication — short-circuits identical read-only tool calls within a sliding window using cached results
Adaptive Retry Temperature — varies temperature across retry attempts so the model doesn't produce the same broken output repeatedly
Per-Tool Trust Score Decay — soft-demotes tools that fail 3+ times and drops them from the schema after 5+ failures in a session
Snapshot & Auto-Rollback — checkpoints file state before each turn and can automatically revert all edits if validation hard-fails
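The per-tool trust score decay described above could be sketched as a simple failure counter. The thresholds of 3 and 5 come from the description; the class name `ToolTrustTracker`, the method names, and the choice to model "soft-demotion" as moving a tool to the end of the advertised list are assumptions.

```typescript
// Thresholds taken from the listing; everything else is assumed.
const DEMOTE_AFTER = 3; // failures before a tool is soft-demoted
const DROP_AFTER = 5; // failures before a tool leaves the schema

type ToolStatus = "active" | "demoted" | "dropped";

class ToolTrustTracker {
  private failures = new Map<string, number>();

  recordFailure(tool: string): void {
    this.failures.set(tool, (this.failures.get(tool) ?? 0) + 1);
  }

  status(tool: string): ToolStatus {
    const count = this.failures.get(tool) ?? 0;
    if (count >= DROP_AFTER) return "dropped";
    if (count >= DEMOTE_AFTER) return "demoted";
    return "active";
  }

  // Tools to advertise this turn: dropped tools vanish, demoted tools go last.
  visibleTools(allTools: string[]): string[] {
    const kept = allTools.filter((t) => this.status(t) !== "dropped");
    const active = kept.filter((t) => this.status(t) === "active");
    const demoted = kept.filter((t) => this.status(t) === "demoted");
    return [...active, ...demoted];
  }
}
```

Because the counts live in memory on the tracker instance, creating a new tracker per session resets every tool to active, which matches the "in a session" scope in the description.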

Deployment Model and Setup

SmallCode installs globally via npm (npm install -g smallcode) or runs directly with npx smallcode. Prebuilt binaries for Windows, macOS, and Linux bundle Node.js and all native addons, eliminating the need for node-gyp or C++ build tools. Configuration is handled via a .env file or smallcode.toml, pointing to any OpenAI-compatible local endpoint such as LM Studio, Ollama, or llama.cpp. Optional cloud escalation to Claude, OpenAI, or DeepSeek is fully opt-in and requires an API key. The RAG indexer requires Python 3 and Git. The tool also exposes a programmatic API (require('smallcode')) for use in CI pipelines or custom tooling.
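For readers unfamiliar with what "OpenAI-compatible local endpoint" means in practice, the sketch below builds a chat completion request for the three local servers named above. It does not show SmallCode's configuration keys, which are not given here. The base URLs are the usual defaults for each server and may differ on a given machine; `buildChatRequest`, the `temperature` value, and the model name passed in are assumptions.

```typescript
// Usual default base URLs for local OpenAI-compatible servers (adjust as needed).
const LOCAL_ENDPOINTS = {
  ollama: "http://localhost:11434/v1",
  lmstudio: "http://localhost:1234/v1",
  llamacpp: "http://localhost:8080/v1",
} as const;

type Provider = keyof typeof LOCAL_ENDPOINTS;

interface ChatMessage {
  role: "system" | "user" | "assistant";
  content: string;
}

interface ChatRequest {
  url: string;
  init: { method: "POST"; headers: Record<string, string>; body: string };
}

// Build (but do not send) a chat completion request for a local server.
function buildChatRequest(
  provider: Provider,
  model: string,
  messages: ChatMessage[],
): ChatRequest {
  return {
    url: `${LOCAL_ENDPOINTS[provider]}/chat/completions`,
    init: {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      // temperature is an arbitrary illustrative value
      body: JSON.stringify({ model, messages, temperature: 0.2 }),
    },
  };
}
```

On a runtime with a global `fetch` (Node.js 18 or later), the result can be sent with `fetch(req.url, req.init)`; the model name must match one that the local server has loaded.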

Update: v1.5.2

The latest release is v1.5.2, published on May 30, 2026, the same day as the last repository push. Recent additions visible in the README include the Adaptive Model Router (which tracks per-model failure rates and auto-routes to medium or strong fallback models), the Contract/Definition-of-Done system (declarative, testable assertions the agent must satisfy before reporting completion), the Evidence Store (which captures what was tried and what worked across sessions), Plan-Then-Execute mode, Semantic Merge recovery for failed patches, and the Benchmark Harness with a diff tool for CI-integrated regression detection. Taken together, these additions point toward making small-model agents more reliable and measurable rather than simply more capable.
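One way a failure-rate-based router might work is sketched below. The listing only states that the router tracks per-model failure rates and falls back to medium or strong models; the tier names, the 0.5 failure-rate threshold, the minimum sample size, and the `ModelRouter` class are all assumptions.

```typescript
// Hypothetical tiers, ordered from cheapest to most capable.
type Tier = "small" | "medium" | "strong";
const TIERS: Tier[] = ["small", "medium", "strong"];

interface Stats {
  attempts: number;
  failures: number;
}

class ModelRouter {
  private stats = new Map<Tier, Stats>();
  private maxFailureRate: number;
  private minAttempts: number;

  constructor(maxFailureRate = 0.5, minAttempts = 4) {
    this.maxFailureRate = maxFailureRate;
    this.minAttempts = minAttempts;
  }

  record(tier: Tier, success: boolean): void {
    const s = this.stats.get(tier) ?? { attempts: 0, failures: 0 };
    s.attempts += 1;
    if (!success) s.failures += 1;
    this.stats.set(tier, s);
  }

  failureRate(tier: Tier): number {
    const s = this.stats.get(tier);
    return s !== undefined && s.attempts > 0 ? s.failures / s.attempts : 0;
  }

  // Pick the cheapest tier that has too little data to judge or is still
  // failing at an acceptable rate; fall through to the strongest tier.
  pick(): Tier {
    for (const tier of TIERS) {
      const s = this.stats.get(tier);
      const enoughData = s !== undefined && s.attempts >= this.minAttempts;
      if (!enoughData || this.failureRate(tier) <= this.maxFailureRate) {
        return tier;
      }
    }
    return "strong";
  }
}
```

The minimum-attempts guard keeps a single early failure from permanently routing away from the local model, which matters when the cheap tier is the whole point of the tool.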

Pricing

Open Source

Full terminal-native coding agent
All 35+ features and subsystems
Prebuilt binaries for Windows, macOS, Linux
npm global install or npx
Programmatic API

Capabilities

Key Features

Terminal-native fullscreen TUI and classic readline fallback
Context budget engine with mid-turn eviction and semantic compression
2-stage tool routing to halve schema context overhead
Forgiving multi-format tool call parser (JSON, YAML, XML, Hermes, Liquid AI markers, plain text)
Patch-first search-and-replace editing primitive
TODO-driven task decomposition with per-step validation
MarrowScript cognition layer with prompt caching and token budget enforcement
Working memory persistent scratchpad across turns
Persistent shell sessions with optional cwd containment
Early-stop detection for repetition loops and patch spirals
Quality monitor for structural failure modes
Read-before-write guard
Tool-call deduplication with cached results
Adaptive retry temperature across failure attempts
Per-tool trust score decay and schema demotion
Snapshot and auto-rollback on validation failure
Model escalation to Claude, OpenAI, or DeepSeek on hard fail
RAG harness with local GitHub corpus indexing
Benchmark harness with smoke, polyglot, and tool-use suites
Benchmark diff tool for CI regression detection
Contract/Definition-of-Done system with testable assertions
Evidence store for cross-session learning
Plan-then-execute mode for multi-step tasks
Semantic merge recovery for failed patches
Adaptive model router based on per-model failure rates
Bootstrap detection for project type and test runner
Knowledge injection from local reference notes
Plugin system with lifecycle hooks and provider registry
Skill system with bundled dev-methodology skills
Programmatic API for CI and custom tooling
Web browsing via Playwright with stealth mode (opt-in)
BoneScript integration for Node.js/TypeScript backend scaffolding
Thinking budget control for reasoning models
Interactive provider wizard (/provider)
Observability: token monitor, context budget bar, execution traces
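The 2-stage tool routing feature listed above can be sketched as follows. The five categories come from the description; the tool names, descriptions, prompt wording, and function names are made up for illustration.

```typescript
// Categories come from the listing; the tools below are hypothetical.
type Category = "read" | "write" | "search" | "run" | "plan";

interface ToolSchema {
  name: string;
  category: Category;
  description: string;
}

const CATEGORIES: Category[] = ["read", "write", "search", "run", "plan"];

const ALL_TOOLS: ToolSchema[] = [
  { name: "read_file", category: "read", description: "Read a file from disk" },
  { name: "list_dir", category: "read", description: "List directory entries" },
  { name: "apply_patch", category: "write", description: "Search-and-replace edit" },
  { name: "grep", category: "search", description: "Search file contents" },
  { name: "bash", category: "run", description: "Run a shell command" },
  { name: "update_todo", category: "plan", description: "Edit the TODO list" },
];

// Stage 1: a tiny prompt that only asks for a category name.
function stageOnePrompt(task: string): string {
  return `Task: ${task}\nAnswer with one word: ${CATEGORIES.join(" | ")}`;
}

// Tolerate extra words or different casing in the stage-1 answer.
function parseCategory(reply: string): Category | null {
  const lower = reply.toLowerCase();
  return CATEGORIES.find((c) => lower.includes(c)) ?? null;
}

// Stage 2: expose only the schemas that belong to the chosen category.
function stageTwoSchemas(category: Category): ToolSchema[] {
  return ALL_TOOLS.filter((t) => t.category === category);
}
```

The saving comes from stage 2: the model sees only the schemas for one category instead of the full tool list, at the cost of one extra short round trip.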

Integrations

LM Studio

Ollama

llama.cpp

OpenAI API

Anthropic Claude

DeepSeek

OpenRouter

DuckDuckGo (web search)

Playwright (web browsing)

ripgrep (search)

SQLite / better-sqlite3 (memory)

BoneScript

budget-aware-mcp

MarrowScript

npm

npx
