SlimSnap

Name: SlimSnap
Availability: OnlineOnly
Author: Alexander Bickov

A macOS app that converts screenshots into structured JSON so terminal-based AI coding agents like Claude Code, Aider, and Codex CLI can read and reason about UI elements.

At a Glance

Pricing

Free

Full app access during launch, no registration required.

Available On

macOS

Alexander Bickov, Riga, Latvia, Est. 2024

Listed Jun 2026

About SlimSnap

SlimSnap is a macOS desktop app built by Alexander Bickov that turns any screenshot into a compact JSON blob — complete with OCR'd text, element bounding boxes, and user annotations — so terminal-based AI coding agents can "see" the UI without accepting image input. The app is free during launch, with a paid tier planned, and the underlying JSON schema is published on GitHub under the MIT license. It targets developers and product people who use CLI agents like Claude Code, Aider, and Codex CLI.

What It Is

SlimSnap bridges the gap between visual UIs and text-only AI agents. Terminal agents can read files, run tests, and write code, but the vendor's premise is that they cannot accept image input, so any UI discussion means writing out a paragraph to describe what a screenshot would show at a glance. SlimSnap captures a screen region, runs local OCR, extracts element types and bounding boxes, and serializes everything into a structured JSON format that can be pasted anywhere text is accepted: terminals, SSH sessions, CI logs, git commits, and more.

The JSON schema (SlimSnap Schema v1.0) is a formal JSON Schema 2020-12 specification. Each export includes a schema_version, ISO-8601 timestamp, image metadata, a screen context object (window title, app name, URL), an elements array of detected UI primitives with normalized 0–1 bounding boxes, and an annotations array capturing user-drawn arrows, callouts, and highlights with structured intent values.
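As a rough illustration of the shape described above, the sketch below builds a hypothetical export and scales one normalized bounding box back to pixels. Only schema_version, elements, annotations, intent, and target_ref are names taken from this listing; the remaining keys (timestamp, image, context, bbox with x/y/w/h, id, type, text) and all values are assumptions for illustration, not the published schema.

```python
import json

# Hypothetical export. Key names other than schema_version, elements,
# annotations, intent, and target_ref are assumed for this sketch.
export = {
    "schema_version": "1.0",
    "timestamp": "2026-06-01T12:00:00Z",  # ISO-8601
    "image": {"width": 1440, "height": 900},
    "context": {
        "window_title": "Settings",
        "app_name": "Safari",
        "url": "https://example.com/settings",
    },
    "elements": [
        {
            "id": "el_1",
            "type": "button",
            "text": "Save",
            # Normalized 0-1 coordinates relative to the captured image.
            "bbox": {"x": 0.25, "y": 0.5, "w": 0.125, "h": 0.0625},
        },
    ],
    "annotations": [
        {"type": "arrow", "intent": "question", "target_ref": "el_1",
         "text": "Why is this disabled?"},
    ],
}


def bbox_to_pixels(bbox, image_width, image_height):
    """Scale a normalized 0-1 box to integer pixel values (x, y, w, h)."""
    return (
        round(bbox["x"] * image_width),
        round(bbox["y"] * image_height),
        round(bbox["w"] * image_width),
        round(bbox["h"] * image_height),
    )


element = export["elements"][0]
pixels = bbox_to_pixels(
    element["bbox"], export["image"]["width"], export["image"]["height"]
)
print(pixels)  # (360, 450, 180, 56)
print(len(json.dumps(export)), "characters when serialized")
```

Normalized coordinates keep the export independent of display scaling, so the same box can be mapped onto the image at any resolution.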

Token Efficiency

The homepage states that a single screenshot billed through Anthropic's vision API costs approximately 1,568 tokens on Claude Sonnet and Haiku, and up to 4,784 tokens on Opus 4.7 and 4.8. A typical SlimSnap JSON export of the same screen runs 600–800 tokens, which the vendor describes as roughly 55% fewer tokens per turn on Sonnet and up to 85% fewer on Opus; both percentages correspond to an export of about 700 tokens, the midpoint of that range. The GitHub README separately claims approximately 12× fewer tokens than raw vision input, a larger ratio than the homepage figures imply. According to the vendor, the reduction compounds across long iterative sessions where the same UI context is referenced repeatedly.
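The vendor's percentages can be checked with simple arithmetic. The sketch below uses only the token counts quoted above (1,568 and 4,784 for vision input, 600–800 for the JSON export); the function and variable names are made up for this example.

```python
# Token counts as quoted from the vendor's homepage.
SONNET_VISION_TOKENS = 1568
OPUS_VISION_TOKENS = 4784


def reduction(vision_tokens, json_tokens):
    """Fraction of tokens saved by sending JSON instead of the image."""
    return 1 - json_tokens / vision_tokens


for json_tokens in (600, 700, 800):
    sonnet = reduction(SONNET_VISION_TOKENS, json_tokens)
    opus = reduction(OPUS_VISION_TOKENS, json_tokens)
    ratio = OPUS_VISION_TOKENS / json_tokens
    print(f"{json_tokens} tokens: Sonnet {sonnet:.1%}, Opus {opus:.1%}, "
          f"Opus ratio {ratio:.1f}x")
```

At 700 tokens this gives about 55% on Sonnet and 85% on Opus, matching the homepage figures; across the 600–800 range the Sonnet saving varies from about 62% down to 49%. The largest ratio these numbers produce is roughly 8× (4,784 / 600), so the README's 12× presumably rests on different inputs.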

How the Workflow Works

Capture: Press ⌘⇧S, drag to select any screen region, release. Runs natively on macOS with no additional installation.
Annotate: Add arrows, callouts, and highlights to point at specific elements. Annotations are serialized as structured objects with intent fields (highlight, explain, action, question) and optional target_ref IDs linking them to specific elements.
Copy JSON: One click copies the full JSON blob to the clipboard. Paste it into Claude Code, Aider, Codex CLI, Cursor, Continue.dev, or any text input.
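To show how an agent or script might use the annotation links described in the steps above, the sketch below resolves each annotation's target_ref to the element it points at. The intent and target_ref names come from this listing; the id, text, and type keys and the sample data are assumptions.

```python
def describe_annotations(export):
    """Return one readable line per annotation, resolving target_ref to an element."""
    elements_by_id = {el["id"]: el for el in export.get("elements", [])}
    lines = []
    for ann in export.get("annotations", []):
        # target_ref is optional, so a missing or unknown ID yields no target.
        target = elements_by_id.get(ann.get("target_ref"))
        label = target["text"] if target else "(no linked element)"
        lines.append(f"{ann.get('intent')}: {ann.get('text', '')} -> {label}")
    return lines


# Hypothetical pasted export, reduced to the fields this sketch reads.
sample = {
    "elements": [{"id": "el_1", "type": "button", "text": "Save"}],
    "annotations": [
        {"intent": "question", "target_ref": "el_1", "text": "Why is this disabled?"},
        {"intent": "highlight", "text": "Check this area"},
    ],
}

for line in describe_annotations(sample):
    print(line)
```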

A Claude Code skill is also published on GitHub (bickov/slimsnap-skill). It reads a config file at ~/.slimsnap/config.json to find the default save folder, lists the folder, and loads the latest JSON file into the agent's context automatically — no hardcoded paths.
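The lookup the skill performs could be approximated as follows. This is a sketch of the described behavior, not the skill's actual code: the config key name save_folder is a guess, and "latest" is taken to mean the most recently modified .json file.

```python
import json
from pathlib import Path

# Config location as described above.
DEFAULT_CONFIG = Path.home() / ".slimsnap" / "config.json"


def load_latest_export(config_path=DEFAULT_CONFIG, folder_key="save_folder"):
    """Read the config, list the save folder, and load the newest JSON file.

    folder_key is a hypothetical key name; the real config may differ.
    Returns None when the folder holds no JSON files.
    """
    config = json.loads(Path(config_path).read_text(encoding="utf-8"))
    folder = Path(config[folder_key]).expanduser()
    candidates = sorted(folder.glob("*.json"), key=lambda p: p.stat().st_mtime)
    if not candidates:
        return None
    return json.loads(candidates[-1].read_text(encoding="utf-8"))
```

Calling load_latest_export() with no arguments reads ~/.slimsnap/config.json; passing config_path makes the function easy to point at a test fixture.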

Privacy and Local Processing

The homepage explicitly states that capture and OCR run locally on the Mac. Screenshots never leave the machine, and no account or server is required to use the app. The free tier requires no registration.

Open Schema, Closed App

The JSON schema specification (bickov/slimsnap-schema) is MIT-licensed and open for anyone to read, validate against, or implement independently. The Mac desktop app that produces the JSON is closed-source. The vendor notes that the schema is implementation-agnostic: users can hand-write valid JSON, generate it from another OCR pipeline, or build exporters for Windows or Linux. The app is currently Mac-only, with Windows and Linux support described as dependent on user demand.
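Because the schema is implementation-agnostic, a hand-written or third-party export can be sanity-checked before it is handed to an agent. The sketch below is a minimal structural check using only the constraints mentioned in this listing (a schema_version field, 0–1 bounding boxes, four intent values); the bbox key names are assumed, and it is not a substitute for validating against the published schema with a JSON Schema 2020-12 validator.

```python
ALLOWED_INTENTS = {"highlight", "explain", "action", "question"}


def check_export(export):
    """Return a list of problems found; an empty list means the checks passed."""
    problems = []
    if "schema_version" not in export:
        problems.append("missing schema_version")
    for i, element in enumerate(export.get("elements", [])):
        bbox = element.get("bbox", {})
        # x/y/w/h are assumed key names for the normalized box.
        for key in ("x", "y", "w", "h"):
            value = bbox.get(key)
            if not isinstance(value, (int, float)) or not 0 <= value <= 1:
                problems.append(f"elements[{i}].bbox.{key} not in 0-1: {value!r}")
    for i, annotation in enumerate(export.get("annotations", [])):
        if annotation.get("intent") not in ALLOWED_INTENTS:
            problems.append(f"annotations[{i}].intent invalid: {annotation.get('intent')!r}")
    return problems


good = {
    "schema_version": "1.0",
    "elements": [{"bbox": {"x": 0.1, "y": 0.2, "w": 0.3, "h": 0.4}}],
    "annotations": [{"intent": "explain"}],
}
bad = {
    "elements": [{"bbox": {"x": 1.5, "y": 0.2, "w": 0.3, "h": 0.4}}],
    "annotations": [{"intent": "shout"}],
}
print(check_export(good))  # []
print(len(check_export(bad)), "problems")  # 3 problems
```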

Pricing

Free

Full app access during launch, no registration required.

Screenshot capture with ⌘⇧S
Local OCR
JSON export with bounding boxes and annotations
Claude Code skill
No account required

Capabilities

Key Features

Screenshot capture with ⌘⇧S keyboard shortcut
Local OCR extracts all text labels, buttons, and error messages
Structured JSON export with element bounding boxes in normalized 0–1 coordinates
Annotation tools: arrows, callouts, highlights
Annotations serialized with structured intent values (highlight, explain, action, question)
One-click copy JSON to clipboard
Claude Code skill for automatic JSON ingestion
Deterministic element IDs for agent reference
Estimated token count included in every export
MIT-licensed open JSON schema (SlimSnap Schema v1.0)
No account or registration required
All processing runs locally — no server uploads
Compatible with Claude Code, Aider, Codex CLI, Cursor, Continue.dev

Integrations

Claude Code

Aider

Codex CLI

Cursor

Continue.dev

Anthropic Claude API

API Available
