# Bumblebee

> Read-only inventory collector for package, extension, and developer-tool metadata on macOS and Linux endpoints, built for fast supply-chain exposure checks.

Bumblebee is an open-source, read-only inventory collector built by Perplexity AI and published under the Apache License 2.0. It targets macOS and Linux developer endpoints and answers a specific supply-chain response question: when an advisory names a package, extension, or version, which developer machines show a match in their on-disk metadata right now? The project is written in Go, ships as a single static binary with zero non-stdlib dependencies, and was first released in May 2026.

## What It Is

Bumblebee sits in the gap between SBOMs (what shipped) and EDR tools (what ran or touched the network). It focuses on the messy local state that neither category covers well: lockfiles, package-manager install metadata, extension manifests, and developer-tool configs scattered across developer workstations. It reads that on-disk state, converts it into structured NDJSON component records, and — when given an exposure catalog — flags exact matches so incident responders can quickly identify affected machines without executing package managers or reading source files.

## Architecture and Scope

The tool is deliberately narrow and read-only:

- **Single static binary** compiled with Go 1.25+, zero non-stdlib dependencies.
- **Three scan profiles** — `baseline`, `project`, and `deep` — for different populations and cadences.
- **No package manager execution**: it never runs `npm ls`, `pip show`, `go list`, or similar commands.
- **MCP config safety**: parses MCP host configs for server inventory but explicitly does not emit environment values or credentials found in `env` blocks.
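
The MCP safety rule above can be illustrated with a minimal sketch. It assumes the `mcpServers` JSON layout commonly used by MCP hosts; the `mcpConfig`, `serverRecord`, and `inventoryMCP` names are hypothetical and are not taken from Bumblebee's source. The point is that the `env` block is never decoded, so its values cannot reach the output.

```go
package main

import (
	"encoding/json"
	"fmt"
	"sort"
)

// mcpConfig models only the fields this sketch needs. The "mcpServers"
// key is an assumption based on common MCP host configs. The "env" and
// "args" blocks are deliberately not modeled, so they are never decoded.
type mcpConfig struct {
	Servers map[string]struct {
		Command string `json:"command"`
	} `json:"mcpServers"`
}

// serverRecord is a hypothetical inventory record for one MCP server.
type serverRecord struct {
	Name    string `json:"name"`
	Command string `json:"command"`
}

// inventoryMCP returns one record per configured server, sorted by name
// so repeated scans of the same file produce identical output.
func inventoryMCP(raw []byte) ([]serverRecord, error) {
	var cfg mcpConfig
	if err := json.Unmarshal(raw, &cfg); err != nil {
		return nil, err
	}
	out := make([]serverRecord, 0, len(cfg.Servers))
	for name, s := range cfg.Servers {
		out = append(out, serverRecord{Name: name, Command: s.Command})
	}
	sort.Slice(out, func(i, j int) bool { return out[i].Name < out[j].Name })
	return out, nil
}

func main() {
	raw := []byte(`{"mcpServers":{"demo":{"command":"npx","env":{"API_TOKEN":"secret"}}}}`)
	recs, err := inventoryMCP(raw)
	if err != nil {
		panic(err)
	}
	b, err := json.Marshal(recs)
	if err != nil {
		panic(err)
	}
	fmt.Println(string(b)) // the env value never appears in the output
}
```

Leaving sensitive fields out of the decode target is a simple way to make redaction structural rather than something a later filtering step has to remember.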

Supported ecosystems include npm (via package-lock.json, pnpm-lock.yaml, yarn.lock, bun.lock), PyPI (dist-info/METADATA), Go modules (go.sum/go.mod), RubyGems (Gemfile.lock), Composer (composer.lock), MCP server configs, VS Code/Cursor/Windsurf/VSCodium editor extensions, and Chromium/Firefox browser extensions.
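
As an illustration of reading on-disk metadata without running a package manager, the sketch below parses `go.sum` content into `(ecosystem, name, version)` components. It relies only on the documented `go.sum` line format (module path, version with an optional `/go.mod` suffix, hash). The `component` type and `parseGoSum` function are hypothetical names for this example, not Bumblebee's parser.

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// component is a hypothetical identity tuple for one discovered module.
type component struct {
	Ecosystem string
	Name      string
	Version   string
}

// parseGoSum extracts unique module/version pairs from go.sum content.
// Each module version normally appears twice: once for the module
// content and once with a "/go.mod" suffix on the version.
func parseGoSum(content string) []component {
	seen := make(map[component]bool)
	var out []component
	sc := bufio.NewScanner(strings.NewReader(content))
	for sc.Scan() {
		fields := strings.Fields(sc.Text())
		if len(fields) != 3 {
			continue // skip blank or malformed lines
		}
		c := component{
			Ecosystem: "go",
			Name:      fields[0],
			Version:   strings.TrimSuffix(fields[1], "/go.mod"),
		}
		if !seen[c] {
			seen[c] = true
			out = append(out, c)
		}
	}
	return out
}

func main() {
	sum := "example.com/mod v1.2.3 h1:abc=\nexample.com/mod v1.2.3/go.mod h1:def=\n"
	for _, c := range parseGoSum(sum) {
		fmt.Printf("%s %s %s\n", c.Ecosystem, c.Name, c.Version)
	}
}
```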

## Output Model

Every scan emits NDJSON records — one per line — with a `scan_summary` record at the end. Package records carry a `confidence` field (`high`, `medium`, or `low`) reflecting how reliably identity and version were established. Finding records are emitted when a package matches an entry in a supplied exposure catalog, including fields for severity, catalog ID, matched version, and source file. Record IDs are content-addressed hashes of a canonical identity tuple, making them stable across runs for deduplication on the receiver side.
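
The following is a minimal sketch of this output model, not Bumblebee's actual schema. The field names, the choice of SHA-256, and the NUL-separated canonical form are all assumptions made for illustration; the source only states that IDs are content-addressed hashes of a canonical identity tuple.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"encoding/json"
	"os"
	"strings"
)

// packageRecord is a hypothetical record shape; the real field names
// may differ.
type packageRecord struct {
	ID         string `json:"id"`
	Type       string `json:"type"`
	Ecosystem  string `json:"ecosystem"`
	Name       string `json:"name"`
	Version    string `json:"version"`
	Confidence string `json:"confidence"` // "high", "medium", or "low"
}

// recordID hashes a canonical identity tuple. The same tuple always
// yields the same ID, so a receiver can deduplicate across runs.
// The NUL separator keeps ("ab","c") distinct from ("a","bc").
func recordID(ecosystem, name, version string) string {
	canonical := strings.Join([]string{ecosystem, name, version}, "\x00")
	sum := sha256.Sum256([]byte(canonical))
	return hex.EncodeToString(sum[:])
}

func main() {
	// json.Encoder writes each value followed by a newline, which is
	// exactly the NDJSON framing: one JSON object per line.
	enc := json.NewEncoder(os.Stdout)
	rec := packageRecord{
		ID:         recordID("npm", "example-pkg", "1.0.1"),
		Type:       "package",
		Ecosystem:  "npm",
		Name:       "example-pkg",
		Version:    "1.0.1",
		Confidence: "high",
	}
	if err := enc.Encode(rec); err != nil {
		panic(err)
	}
	// The summary record comes last, after all component records.
	summary := map[string]any{"type": "scan_summary", "records": 1}
	if err := enc.Encode(summary); err != nil {
		panic(err)
	}
}
```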

## Exposure Catalog Format

Bumblebee uses a minimal JSON catalog format for exposure matching — exact `(ecosystem, name, version)` tuples only. The repository ships a `threat_intel/` directory containing maintained exposure catalogs built from public threat-intelligence reporting on recent supply-chain campaigns. According to the repository README, these catalogs are assembled with Perplexity Computer and updated via pull requests as new campaigns are reported.
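
Exact-tuple matching can be sketched as a map lookup. The catalog JSON shape, the `catalogEntry` fields, and the `buildIndex` and `match` helpers below are hypothetical and chosen for illustration; consult the repository for the real catalog schema. The package name and ID in the example are placeholders, not real advisories.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// catalogEntry is an assumed shape for one exposure catalog entry.
type catalogEntry struct {
	ID        string `json:"id"`
	Severity  string `json:"severity"`
	Ecosystem string `json:"ecosystem"`
	Name      string `json:"name"`
	Version   string `json:"version"`
}

// key is the exact (ecosystem, name, version) tuple used for lookup.
type key struct{ ecosystem, name, version string }

// buildIndex parses a JSON array of entries into a lookup map.
func buildIndex(raw []byte) (map[key]catalogEntry, error) {
	var entries []catalogEntry
	if err := json.Unmarshal(raw, &entries); err != nil {
		return nil, err
	}
	idx := make(map[key]catalogEntry, len(entries))
	for _, e := range entries {
		idx[key{e.Ecosystem, e.Name, e.Version}] = e
	}
	return idx, nil
}

// match reports whether a discovered component is listed in the catalog.
// There is no range or fuzzy matching: all three fields must be equal.
func match(idx map[key]catalogEntry, ecosystem, name, version string) (catalogEntry, bool) {
	e, ok := idx[key{ecosystem, name, version}]
	return e, ok
}

func main() {
	raw := []byte(`[{"id":"EX-1","severity":"high","ecosystem":"npm","name":"example-pkg","version":"1.0.1"}]`)
	idx, err := buildIndex(raw)
	if err != nil {
		panic(err)
	}
	if e, ok := match(idx, "npm", "example-pkg", "1.0.1"); ok {
		fmt.Printf("finding: %s (%s)\n", e.ID, e.Severity)
	}
}
```

Exact matching trades coverage for precision: a version one patch away from a listed entry produces no finding, which keeps false positives low during an incident.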

## Update: v0.1.1

The latest release is **v0.1.1**, published on 2026-05-22, just two days after the repository was created on 2026-05-20. The repository had accumulated 165 stars and 8 forks within days of launch, and the project's GitHub topics — `golang`, `package-inventory`, `supply-chain-security` — reflect its focused positioning. The `selftest` subcommand provides a built-in end-to-end smoke test against embedded fixtures, useful for validating fleet rollouts without network calls.

## Features
- Read-only on-disk inventory collection (no package manager execution)
- Three scan profiles: baseline, project, deep
- NDJSON structured output with per-record confidence levels
- Exposure catalog matching for exact (ecosystem, name, version) lookups
- Single static binary, zero non-stdlib dependencies
- Supports npm, pnpm, Yarn, Bun, PyPI, Go modules, RubyGems, Composer, MCP, editor extensions, browser extensions
- MCP host config parsing without emitting credentials or env values
- Content-addressed record IDs stable across runs for deduplication
- Built-in selftest subcommand with embedded fixtures
- HTTPS and file transport output options
- Maintained threat_intel/ exposure catalogs from public supply-chain reporting
- Version stamping via ldflags for traceable production builds
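
Version stamping via ldflags is a standard Go technique, sketched below. The variable name `version`, its `main` package path, and the `versionString` helper are assumptions for this example; the actual variable Bumblebee stamps may live elsewhere.

```go
package main

import "fmt"

// version is overridden at build time by the linker. "dev" marks a
// build that was not stamped. Example build command, assuming the
// variable lives in package main:
//
//	go build -ldflags "-X main.version=v0.1.1" .
var version = "dev"

// versionString returns the identifier a binary could report so that
// records can be traced back to the exact build that produced them.
func versionString() string {
	return "bumblebee " + version
}

func main() {
	fmt.Println(versionString())
}
```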

## Integrations
VS Code, Cursor, Windsurf, VSCodium, Chromium-family browsers, Firefox, Claude Desktop (MCP config), Gemini CLI / Code Assist (MCP config), Cline (MCP config), npm, pnpm, Yarn, Bun, PyPI, Go modules, RubyGems, Composer

## Platforms
MACOS, LINUX, CLI

## Pricing
Open Source

## Version
v0.1.1

## Links
- Website: https://github.com/perplexityai/bumblebee
- Documentation: https://github.com/perplexityai/bumblebee/blob/main/docs/inventory-sources.md
- Repository: https://github.com/perplexityai/bumblebee
- EveryDev.ai: https://www.everydev.ai/tools/bumblebee-supply-chain-scanner
