# Gemma Gem

> A browser extension that runs Google's Gemma 4 AI model entirely on-device via WebGPU, enabling page reading, form filling, and JavaScript execution with no API keys or cloud dependency.

Gemma Gem is an open-source Chrome extension that brings a personal AI assistant directly into your browser, powered by Google's Gemma 4 model running entirely on-device via WebGPU. There are no API keys, no cloud calls, and no data leaving your machine: all inference happens locally on your GPU. The extension injects a chat overlay into any page, letting you ask questions about its content or instruct the agent to interact with the page on your behalf. It supports two model sizes (E2B, ~500 MB, and E4B, ~1.5 GB), both cached locally after the first download.

- **On-device inference** — *Uses `@huggingface/transformers` with WebGPU for fully local Gemma 4 model execution; no internet connection is required after the model download.*
- **Page reading** — *The `read_page_content` tool extracts text or HTML from the full page or a specific CSS selector, letting the model understand any site you visit.*
- **Element interaction** — *The `click_element` and `type_text` tools let the agent click buttons and fill forms by CSS selector, enabling true browser automation.*
- **Screenshot capture** — *`take_screenshot` captures the visible page as a PNG, giving the model visual context alongside text.*
- **JavaScript execution** — *`run_javascript` runs arbitrary JS in the page context, with full DOM access, via the service worker.*
- **Scrolling** — *`scroll_page` scrolls up or down by a specified pixel amount for navigating long pages.*
- **Model switching** — *Toggle between the Gemma 4 E2B and E4B models in settings; the selection persists across sessions.*
- **Native thinking toggle** — *Enable or disable Gemma 4's native thinking mode directly from the chat settings panel.*
- **Per-site disable** — *Suppress the extension on specific hostnames, persisted across browser sessions.*
- **Zero-dependency agent core** — *The `agent/` directory defines clean `ModelBackend` and `ToolExecutor` interfaces with no external dependencies, making it extractable as a standalone library.*
- **Developer-friendly build** — *Built with WXT, a Vite-based Chrome extension framework; supports development builds with full logging and production builds with minification.*

## Features

- On-device Gemma 4 inference via WebGPU
- No API keys or cloud required
- Read page content by CSS selector
- Click elements by CSS selector
- Type text into inputs
- Scroll the page up or down
- Take screenshots of the visible page
- Run JavaScript in the page context
- Switch between the E2B (~500 MB) and E4B (~1.5 GB) models
- Native Gemma 4 thinking-mode toggle
- Max-iterations cap for tool-call loops
- Clear conversation context per page
- Per-hostname disable/enable
- Shadow DOM chat overlay injected into pages
- 128K context window
- q4f16 quantization

## Integrations

Google Gemma 4, Hugging Face Transformers.js, WebGPU, WXT (Chrome extension framework), marked (Markdown rendering), ONNX Runtime

## Platforms

API, Browser Extension

## Pricing

Open Source

## Links

- Website: https://github.com/kessler/gemma-gem
- Documentation: https://github.com/kessler/gemma-gem#readme
- Repository: https://github.com/kessler/gemma-gem
- EveryDev.ai: https://www.everydev.ai/tools/gemma-gem
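## Agent core sketch

The agent core described above pairs a `ModelBackend` with a `ToolExecutor` and caps tool-call loops at a maximum number of iterations. As an illustration only (the actual type signatures and tool schema live in the repository's `agent/` directory; the shapes below are assumptions), a minimal sketch of how such a pair might compose:

```typescript
// Hypothetical shapes; the real definitions are in the repo's agent/ directory.
interface ToolCall {
  name: string; // e.g. "read_page_content"
  args: Record<string, unknown>;
}

// ModelBackend: turns the conversation so far into either a final answer
// or a request to run a tool.
interface ModelBackend {
  generate(messages: string[]): Promise<{ text?: string; toolCall?: ToolCall }>;
}

// ToolExecutor: executes a named tool and returns its result as text.
interface ToolExecutor {
  execute(call: ToolCall): Promise<string>;
}

// A minimal agent loop with an iteration cap, mirroring the
// "max-iterations cap for tool-call loops" feature.
async function runAgent(
  backend: ModelBackend,
  tools: ToolExecutor,
  prompt: string,
  maxIterations = 5,
): Promise<string> {
  const messages = [prompt];
  for (let i = 0; i < maxIterations; i++) {
    const step = await backend.generate(messages);
    if (step.text !== undefined) return step.text; // model produced an answer
    if (step.toolCall) {
      // Run the requested tool and feed the result back to the model.
      const result = await tools.execute(step.toolCall);
      messages.push(`tool:${step.toolCall.name} -> ${result}`);
    }
  }
  return "Stopped: max iterations reached.";
}
```

Because neither interface references browser or model APIs, any backend (a local Gemma session, a mock for tests) and any executor (real page tools, stubs) plug in interchangeably, which is what makes the core extractable as a standalone library.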