# Gemma Gem

> A browser extension that runs Google's Gemma 4 AI model entirely on-device via WebGPU, enabling page reading, form filling, and JavaScript execution with no API keys or cloud dependency.

Gemma Gem is an open-source Chrome extension that brings a personal AI assistant directly into your browser, powered by Google's Gemma 4 model running entirely on-device via WebGPU. There are no API keys, no cloud calls, and no data leaving your machine: all inference happens locally on your GPU. The extension injects a chat overlay into any page, letting you ask questions about its content or instruct the agent to interact with the page on your behalf. It supports two model sizes (E2B, ~500 MB, and E4B, ~1.5 GB), both cached locally after the first download.

- **On-device inference** — *Uses `@huggingface/transformers` with WebGPU for fully local Gemma 4 model execution; no internet connection is required after the model download.*
- **Page reading** — *The `read_page_content` tool extracts text or HTML from the full page or a specific CSS selector, letting the model understand any site you visit.*
- **Element interaction** — *The `click_element` and `type_text` tools let the agent click buttons and fill forms by CSS selector, enabling true browser automation.*
- **Screenshot capture** — *`take_screenshot` captures the visible page as a PNG, giving the model visual context alongside text.*
- **JavaScript execution** — *`run_javascript` runs arbitrary JS in the page context, with full DOM access, via the service worker.*
- **Scrolling** — *`scroll_page` scrolls up or down by a specified pixel amount for navigating long pages.*
- **Model switching** — *Toggle between the Gemma 4 E2B and E4B models in settings; the selection persists across sessions.*
- **Native thinking toggle** — *Enable or disable Gemma 4's native thinking mode directly from the chat settings panel.*
- **Per-site disable** — *Suppress the extension on specific hostnames, persisted across browser sessions.*
- **Zero-dependency agent core** — *The `agent/` directory defines clean `ModelBackend` and `ToolExecutor` interfaces with no external dependencies, making it extractable as a standalone library.*
- **Developer-friendly build** — *Built with WXT, a Vite-based Chrome extension framework; supports development builds with full logging and production builds with minification.*

## Features

- On-device Gemma 4 inference via WebGPU
- No API keys or cloud required
- Read page content by CSS selector
- Click elements by CSS selector
- Type text into inputs
- Scroll the page up or down
- Take screenshots of the visible page
- Run JavaScript in the page context
- Switch between the E2B (~500 MB) and E4B (~1.5 GB) models
- Native Gemma 4 thinking-mode toggle
- Max-iterations cap for tool-call loops
- Clear conversation context per page
- Per-hostname disable/enable
- Shadow DOM chat overlay injected into pages
- 128K context window
- q4f16 quantization

## Integrations

Google Gemma 4, Hugging Face Transformers.js, WebGPU, WXT (Chrome extension framework), marked (Markdown rendering), ONNX Runtime

## Platforms

API, Browser Extension

## Pricing

Open Source

## Links

- Website: https://github.com/kessler/gemma-gem
- Documentation: https://github.com/kessler/gemma-gem#readme
- Repository: https://github.com/kessler/gemma-gem
- EveryDev.ai: https://www.everydev.ai/tools/gemma-gem
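## Agent core sketch

The agent core described above pairs a `ModelBackend` with a `ToolExecutor` and caps tool-call loops at a maximum number of iterations. As an illustration only (the actual type signatures and tool schema live in the repository's `agent/` directory; the shapes below are assumptions), a minimal sketch of how such a pair might compose:

```typescript
// Hypothetical shapes; the real definitions are in the repo's agent/ directory.
interface ToolCall {
  name: string; // e.g. "read_page_content"
  args: Record<string, unknown>;
}

// ModelBackend: turns the conversation so far into either a final answer
// or a request to run a tool.
interface ModelBackend {
  generate(messages: string[]): Promise<{ text?: string; toolCall?: ToolCall }>;
}

// ToolExecutor: executes a named tool and returns its result as text.
interface ToolExecutor {
  execute(call: ToolCall): Promise<string>;
}

// A minimal agent loop with an iteration cap, mirroring the
// "max-iterations cap for tool-call loops" feature.
async function runAgent(
  backend: ModelBackend,
  tools: ToolExecutor,
  prompt: string,
  maxIterations = 5,
): Promise<string> {
  const messages = [prompt];
  for (let i = 0; i < maxIterations; i++) {
    const step = await backend.generate(messages);
    if (step.text !== undefined) return step.text; // model produced an answer
    if (step.toolCall) {
      // Run the requested tool and feed the result back to the model.
      const result = await tools.execute(step.toolCall);
      messages.push(`tool:${step.toolCall.name} -> ${result}`);
    }
  }
  return "Stopped: max iterations reached.";
}
```

Because neither interface references browser or model APIs, any backend (a local Gemma session, a mock for tests) and any executor (real page tools, stubs) plug in interchangeably, which is what makes the core extractable as a standalone library.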