# Nucleus

> Extremely lightweight, security-hardened, declarative container runtime for Linux, designed for AI agent sandboxes and production NixOS services.

Nucleus is a minimalist container runtime for Linux, written in Rust and published under the MIT/Apache-2.0 dual license by the sig-id organization on GitHub. It provides isolated execution environments using Linux kernel primitives directly — namespaces, cgroups v2, pivot_root, seccomp, Landlock, and capabilities — without the overhead of traditional container runtimes like Docker. The project targets two primary workloads: ephemeral AI agent sandboxes that need fast cold-start isolation, and long-running NixOS production services that require fully declarative, reproducible deployments.

## What It Is

Nucleus is a single-binary container runtime that sits closer in spirit to `runc` or gVisor than to Docker. It drops the image-and-distribution half of the container ecosystem — there are no images, no Dockerfile, no registry pull — in exchange for deeper isolation, auditable security policy, and first-class Nix integration. The runtime supports three operating modes: **agent mode** (default, ephemeral fast-startup sandboxes), **strict agent mode** (fail-closed isolation for ephemeral workloads), and **production mode** (strict isolation for long-running NixOS services with declarative configuration, reproducible Nix-built root filesystems, egress policy enforcement, health checks, and systemd integration).

## Architecture and Isolation Primitives

Nucleus uses Linux kernel isolation directly rather than through a daemon or socket API:

- **Namespaces** — PID, mount, network, UTS, IPC, user, cgroup, and optional time isolation
- **cgroups v2** — CPU, memory, PIDs, and I/O resource limits
- **pivot_root** — Filesystem isolation (chroot fallback available only in agent mode)
- **Capabilities** — All capabilities dropped by default; configurable via TOML policy files
- **seccomp** — Syscall allowlist filtering with per-service JSON profiles and trace-based generation
- **Landlock** — Path-based filesystem access control via TOML policy or hardcoded defaults (Linux 5.13+)
- **gVisor** — Optional application kernel (`runsc`) with `none`, `bridge`, and `gvisor-host` network modes
- **OCI bundle generation** — Emits OCI `config.json` and bundle layout for gVisor, including process identity, mounts, namespaces, seccomp, hooks, and cgroup path wiring

Container filesystems are backed by tmpfs and either populated with context files (agent mode) or mounted from a pre-built Nix rootfs closure (production mode).
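
To make the OCI bundle item above more concrete, here is a minimal sketch of what a generated `config.json` could contain. The field names follow the OCI runtime specification, but `build_oci_config`, its parameters, the default values, and the cgroup path are hypothetical illustrations, not Nucleus's actual generator or output. The user namespace is left out because it would also need UID/GID mappings.

```python
import json

def build_oci_config(args, rootfs="rootfs", uid=1000, gid=1000,
                     extra_gids=(), cgroups_path="/nucleus/demo"):
    """Return a minimal OCI runtime config as a dict (illustrative only)."""
    # Subset of the namespaces listed above; all values here are assumptions.
    namespaces = ["pid", "mount", "network", "uts", "ipc", "cgroup"]
    return {
        "ociVersion": "1.0.2",
        "process": {
            "args": list(args),
            "cwd": "/",
            # Process identity, including supplementary groups.
            "user": {"uid": uid, "gid": gid, "additionalGids": list(extra_gids)},
            "noNewPrivileges": True,
            # Empty sets mirror an "all capabilities dropped" default.
            "capabilities": {"bounding": [], "effective": [], "permitted": []},
        },
        "root": {"path": rootfs, "readonly": True},
        "mounts": [
            {"destination": "/proc", "type": "proc", "source": "proc"},
            {"destination": "/tmp", "type": "tmpfs", "source": "tmpfs",
             "options": ["nosuid", "nodev", "mode=1777"]},
        ],
        "linux": {
            "namespaces": [{"type": t} for t in namespaces],
            "cgroupsPath": cgroups_path,
        },
    }

config = build_oci_config(["/bin/sh", "-c", "echo hello"], extra_gids=[44])
print(json.dumps(config, indent=2))
```

A bundle in this sense is a directory holding this `config.json` next to the root filesystem directory named in `root.path`.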

## Nix-Native Deployment Model

Production deployments in Nucleus are built around a fully declarative model: Nix builds the root filesystem, the NixOS module declares the service, and Nucleus mounts a pinned, reproducible closure at runtime. Key integration points include:

- `nucleus.lib.mkRootfs` — builds a minimal, reproducible root filesystem from specified Nix packages
- `nucleus.lib.mkAgentToolchainRootfs` — layers a broad agent development toolchain for provider CLIs (claude, codex, gemini)
- First-class NixOS module — each container becomes a `nucleus-<name>.service` systemd unit with journald logging, sd_notify readiness, and automatic restart
- Flake-based packaging with pinned inputs and rootfs attestation for auditable runtime state
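
The idea behind rootfs attestation can be illustrated with a deterministic digest over a directory tree: compute it once at build time, pin it, and compare at startup. The sketch below is an assumption-laden illustration. `tree_digest` and `demo` are hypothetical names, and the hashing scheme (sorted relative paths plus file contents, ignoring permissions and other metadata) is not Nucleus's actual manifest format.

```python
import hashlib
import os
import tempfile

def tree_digest(root):
    """Hash relative paths and file contents in a fixed traversal order."""
    h = hashlib.sha256()
    for dirpath, dirnames, filenames in os.walk(root):
        dirnames.sort()  # fix traversal order so the digest is deterministic
        for name in sorted(filenames):
            path = os.path.join(dirpath, name)
            h.update(os.path.relpath(path, root).encode() + b"\0")
            if os.path.islink(path):
                # Hash the link target instead of following it.
                h.update(os.readlink(path).encode())
            else:
                with open(path, "rb") as f:
                    for chunk in iter(lambda: f.read(65536), b""):
                        h.update(chunk)
            h.update(b"\0")
    return h.hexdigest()

def demo():
    """Pin a digest, re-check it, then detect a modified file."""
    with tempfile.TemporaryDirectory() as root:
        os.makedirs(os.path.join(root, "bin"))
        app = os.path.join(root, "bin", "app")
        with open(app, "wb") as f:
            f.write(b"v1")
        pinned = tree_digest(root)
        unchanged = tree_digest(root) == pinned
        with open(app, "wb") as f:
            f.write(b"v2")
        tampered = tree_digest(root) != pinned
        return unchanged, tampered

print(demo())
```

A real attestation scheme would also need to decide how to treat ownership, modes, and special files, which this sketch deliberately skips.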

## Performance Benchmarks

The repository publishes benchmark results comparing Nucleus against Docker and bare metal. According to the project's own benchmarks on Linux 6.18 x86_64:

- **Cold start**: Nucleus reports ~12 ms vs Docker's ~500 ms
- **PostgreSQL 18 pgbench (SELECT-only)**: Nucleus running PostgreSQL with the `io_uring` I/O method reports ~107,039 TPS vs bare metal's ~84,895 TPS
- **PostgreSQL 18 pgbench (TPC-B mixed)**: Nucleus running PostgreSQL with the `worker` I/O method reports ~1,757 TPS vs bare metal's ~1,490 TPS

The project notes that occasional wins over bare metal in the PostgreSQL benchmarks should be treated as benchmark noise rather than a guaranteed speedup.
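
For a sense of scale, the ratios implied by the published figures can be worked out directly. This is plain arithmetic on the numbers quoted above, not an additional measurement, and the variable names are only for illustration.

```python
# Figures as quoted from the project's own benchmarks.
nucleus_cold_ms, docker_cold_ms = 12, 500
nucleus_select_tps, bare_select_tps = 107_039, 84_895
nucleus_tpcb_tps, bare_tpcb_tps = 1_757, 1_490

# Cold start: lower is better, so divide Docker's time by Nucleus's.
cold_start_ratio = docker_cold_ms / nucleus_cold_ms
# Throughput: higher is better, so divide Nucleus's TPS by bare metal's.
select_ratio = nucleus_select_tps / bare_select_tps
tpcb_ratio = nucleus_tpcb_tps / bare_tpcb_tps

print(f"cold start: {cold_start_ratio:.1f}x faster than Docker")
print(f"SELECT-only: {select_ratio:.2f}x bare metal")
print(f"TPC-B mixed: {tpcb_ratio:.2f}x bare metal")
```

The throughput ratios come out roughly 18 to 26 percent above bare metal, which is exactly the kind of result the project's own caveat about benchmark noise applies to.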

## Security Policy and Audit Controls

Nucleus externalizes security policy from the application build, keeping it auditable by security engineers independently of application rebuilds:

- Per-service seccomp profiles (OCI JSON format) with SHA-256 pinning
- Capability bounding set policies (TOML)
- Landlock filesystem access rules (TOML)
- Seccomp trace mode records actual syscall usage; `nucleus seccomp generate` creates a minimal allowlist profile
- Structured audit log, machine-readable lifecycle event streams (JSON Lines), context hashing, rootfs attestation, seccomp deny logging, mount flag verification, and kernel lockdown assertions
- Optional OpenTelemetry export for container lifecycle tracing via `NUCLEUS_OTLP_ENDPOINT`
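
The trace-then-pin workflow described above can be sketched as follows: collapse a recorded syscall trace into a deny-by-default allowlist in the OCI seccomp JSON shape, then pin the serialized profile by its SHA-256 digest. The helper names (`profile_from_trace`, `serialize`, `verify_pin`), the sample trace, and the choice of `SCMP_ACT_ERRNO` as the default action are assumptions for illustration. This is not the output of `nucleus seccomp generate`.

```python
import hashlib
import json

def profile_from_trace(traced_syscalls):
    """Build a minimal allowlist profile from observed syscall names."""
    names = sorted(set(traced_syscalls))  # dedupe and fix ordering
    return {
        "defaultAction": "SCMP_ACT_ERRNO",   # anything not listed is denied
        "architectures": ["SCMP_ARCH_X86_64"],
        "syscalls": [{"names": names, "action": "SCMP_ACT_ALLOW"}],
    }

def serialize(profile):
    """Serialize with sorted keys so equal profiles hash identically."""
    return json.dumps(profile, sort_keys=True, indent=2).encode()

def verify_pin(profile_bytes, pinned_hex):
    """Return True only if the profile bytes match the pinned digest."""
    return hashlib.sha256(profile_bytes).hexdigest() == pinned_hex

# Hypothetical trace; a real one would come from seccomp trace mode.
trace = ["read", "write", "read", "openat", "exit_group"]
profile = profile_from_trace(trace)
blob = serialize(profile)
pin = hashlib.sha256(blob).hexdigest()

print(profile["syscalls"][0]["names"])
print(verify_pin(blob, pin), verify_pin(blob + b"\n", pin))
```

Pinning the digest means any edit to the profile, even a whitespace change, has to be accompanied by an explicit update of the pin, which is what makes the policy reviewable separately from the application.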

## Update: v0.3.3

The latest release is v0.3.3, published on April 8, 2026. The most recent push to the repository was on June 9, 2026, which indicates active development. Recent features documented in the README include:

- Privilege drop for services (`--user`, `--group`, `--additional-group`)
- Ownership-aware secrets and writable paths
- OCI bundle identity support with `process.user` and supplementary groups
- Probe execution under the workload identity
- systemd/NixOS service integration improvements, including `user`, `group`, and `supplementaryGroups` options exposed in the NixOS module

The project also maintains TLA+ formal specifications for its subsystems, verified with the Apalache model checker.

## Features
- Agent mode for ephemeral AI agent sandboxes with fast cold-start (~12 ms)
- Strict agent mode with fail-closed isolation
- Production mode for long-running NixOS services
- Nix-native integration with mkRootfs and mkAgentToolchainRootfs helpers
- First-class NixOS module generating systemd service units
- gVisor integration as optional application kernel
- OCI bundle generation for gVisor (config.json, process identity, mounts, seccomp)
- cgroups v2 resource limits (CPU, memory, PIDs, I/O)
- Landlock LSM path-based filesystem access control
- Per-service seccomp profiles with SHA-256 pinning and trace-based generation
- TOML capability bounding set policies
- Egress policy enforcement with deny-by-default outbound rules
- Detached mode via systemd transient services
- Multi-container topology (Compose-equivalent TOML DAG)
- In-memory secrets via dedicated tmpfs at /run/secrets
- Rootfs attestation with .nucleus-rootfs-sha256 manifest
- Structured audit log and machine-readable lifecycle event streams (JSON Lines)
- OpenTelemetry export for container lifecycle tracing
- CRIU checkpoint and restore support
- Seccomp trace mode for syscall recording and profile generation
- Workspace mount modes: bind-rw, bind-ro, copy-in-out
- Programmatic launch config via TOML or JSON
- TLA+ formal specifications verified with Apalache model checker
- Kernel lockdown assertion support
- Time namespace isolation
- PTY/console socket support following OCI convention
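
The multi-container topology item above describes a dependency DAG. A start order for such a graph can be derived with a topological sort, as in this minimal sketch. The service names and the mapping shape are hypothetical stand-ins for a parsed TOML file, not the actual Nucleus topology schema.

```python
from graphlib import CycleError, TopologicalSorter

# Hypothetical topology: each service maps to the services it depends on.
topology = {
    "db": [],
    "api": ["db"],
    "web": ["api"],
    "worker": ["db"],
}

def start_order(deps):
    """Return services ordered so that dependencies start first."""
    return list(TopologicalSorter(deps).static_order())

def has_cycle(deps):
    """Report whether the dependency graph contains a cycle."""
    try:
        start_order(deps)
    except CycleError:
        return True
    return False

order = start_order(topology)
print(order)
print(has_cycle({"a": ["b"], "b": ["a"]}))
```

Rejecting cyclic graphs up front is the main reason to model the topology as a DAG: every valid configuration then has at least one well-defined start order.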

## Integrations
NixOS, systemd, gVisor (runsc), CRIU (checkpoint/restore), OpenTelemetry (OTLP), slirp4netns (userspace NAT), iptables, journald, systemd-creds (LoadCredential/LoadCredentialEncrypted), Apalache (TLA+ model checker), Cargo/crates.io

## Platforms
LINUX, API, CLI

## Pricing
Open Source

## Version
v0.3.3

## Links
- Website: https://github.com/sig-id/nucleus
- Documentation: https://github.com/sig-id/nucleus#readme
- Repository: https://github.com/sig-id/nucleus
- EveryDev.ai: https://www.everydev.ai/tools/nucleus-container-runtime
