# AgentQL

> AgentQL connects LLMs and AI agents to the web using a natural language query language, Python/JavaScript SDKs, REST API, and browser debugger for data extraction and automation.

AgentQL, built by TinyFish, is a suite of tools that connects LLMs and AI agents to the web through an AI-powered query language and supporting SDKs. Developers describe the data they want in natural language instead of writing fragile XPath or CSS selectors, and the system uses AI to locate the matching elements on the live page. The project is open source under the MIT License and is maintained on GitHub.

## What It Is

AgentQL is a web data extraction and automation platform centered on a custom query language that uses natural language to pinpoint elements and structured data on a web page, whether public or private, static or dynamically generated, including pages behind authentication. Instead of brittle DOM selectors that break when a site is redesigned, AgentQL's AI-powered selectors analyze page structure semantically and self-heal as the UI changes over time. The same query can work across multiple similar sites, which makes it reusable across data pipelines.

## Core Toolset

AgentQL ships as a multi-component toolkit:

- **Python SDK** — integrates with Playwright for browser-based automation and scraping in Python
- **JavaScript SDK** — the same Playwright integration for Node.js workflows
- **REST API** — executes queries against any public URL without requiring a local browser
- **Debugger Browser Extension** — a Chrome extension for writing and optimizing queries in real time on live pages
- **Playground** — an interactive environment for testing queries and exporting Python scripts
- **MCP server** — integrates with agent frameworks via the Model Context Protocol

## How the Query Language Works

Queries are written in a GraphQL-inspired syntax where field names describe the data in plain English. For example, a query asking for `products[] { product_name product_price(include currency symbol) }` returns a structured JSON array with those fields populated from whatever e-commerce page is loaded. Transforms can be applied inline within queries, as with the parenthesized `(include currency symbol)` hint in that example, and list syntax (`[]`) handles repeated elements automatically. The AI layer maps these natural language field names to actual DOM elements, so developers do not have to inspect the HTML or maintain selectors by hand.
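As a rough illustration, the example query could be run through the Python SDK along the following lines. This is a minimal sketch, not the project's reference usage: it assumes the `agentql` and `playwright` packages are installed, that an API key is available in the `AGENTQL_API_KEY` environment variable, and that `agentql.wrap` and `query_data` behave as the SDK documentation describes. The helper name `extract_products` is hypothetical.

```python
# Query in AgentQL syntax: field names describe the data in plain English.
PRODUCT_QUERY = """
{
    products[] {
        product_name
        product_price(include currency symbol)
    }
}
"""


def extract_products(url: str) -> list:
    """Load `url` in a local browser and return the matched products.

    Hypothetical helper; assumes the AgentQL Python SDK and Playwright are
    installed and AGENTQL_API_KEY is set in the environment.
    """
    # Imported lazily so this module can be loaded without the SDK installed.
    import agentql
    from playwright.sync_api import sync_playwright

    with sync_playwright() as playwright:
        browser = playwright.chromium.launch(headless=True)
        try:
            # wrap() adds AgentQL query methods to a regular Playwright page.
            page = agentql.wrap(browser.new_page())
            page.goto(url)
            data = page.query_data(PRODUCT_QUERY)
        finally:
            browser.close()
    # The response is expected to mirror the query shape: a "products" list.
    return data.get("products", [])
```

Calling `extract_products` with the URL of a product listing page should return a list of dictionaries keyed by `product_name` and `product_price`, mirroring the shape of the query.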

## Integration and Automation Fit

AgentQL is designed to slot into existing data and agent workflows rather than replace them. The GitHub README lists integrations with LangChain, Zapier, and an MCP server for agent frameworks. The REST API endpoint enables browserless data retrieval from public URLs, which is useful for lightweight pipelines that don't need a full headless browser. PDF parsing for tables and other structured documents is also supported. With the SDKs, the platform works on pages that require login and handles infinite scroll, popup dismissal, form submission, and paginated data collection, all of which are demonstrated in the project's example library.
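The browserless REST path can be sketched with the Python standard library alone. The endpoint URL, the `X-API-Key` header name, and the `query`/`url` body fields below are assumptions based on the public API documentation and should be checked against the current reference before use; the function names are hypothetical.

```python
import json
import urllib.request

# Assumed endpoint and header name; confirm both in the AgentQL API reference.
AGENTQL_ENDPOINT = "https://api.agentql.com/v1/query-data"


def build_query_request(api_key: str, url: str, query: str) -> urllib.request.Request:
    """Build (but do not send) a POST request asking AgentQL to query `url`."""
    payload = {"query": query, "url": url}
    return urllib.request.Request(
        AGENTQL_ENDPOINT,
        data=json.dumps(payload).encode("utf-8"),
        headers={"X-API-Key": api_key, "Content-Type": "application/json"},
        method="POST",
    )


def run_query(api_key: str, url: str, query: str) -> dict:
    """Send the request and decode the JSON response body."""
    request = build_query_request(api_key, url, query)
    with urllib.request.urlopen(request, timeout=60) as response:
        return json.loads(response.read().decode("utf-8"))
```

Separating request construction from sending keeps the payload easy to inspect or log before any network call is made, which suits the lightweight pipelines this path is aimed at.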

## Why It Got Attention

AgentQL was recognized as Product Hunt's #1 Product of the Day and #1 Product of the Week, according to the product's homepage. The GitHub repository, created in February 2024, has accumulated over 1,300 stars and 160 forks. The project's pitch, replacing fragile XPath/CSS selectors with semantic, self-healing natural language queries, addresses a well-known pain point in web scraping and RPA workflows. Developer testimonials on the homepage highlight the value of semantic element grounding for avoiding context window issues and hallucinations when feeding web content to LLMs.

## Open-Source Deployment Model

The core AgentQL repository is published under the MIT License by Tiny Fish, Inc. and is freely available to fork, modify, and distribute. The hosted API and remote browser infrastructure are commercial services with usage-based billing layered on top of the open-source foundation. Developers can run local automation using the SDKs with their own API key, or use the managed remote browser sessions for cloud-scale workflows.

## Features
- AI-powered natural language query language for web data extraction
- Python SDK with Playwright integration
- JavaScript SDK with Playwright integration
- REST API for browserless data retrieval
- Chrome debugger browser extension for real-time query optimization
- Interactive playground with Python script export
- Self-healing selectors resilient to UI changes
- Cross-site query reusability
- Structured JSON output defined by query shape
- Works on authenticated and dynamically generated pages
- PDF parsing for tables and structured documents
- Remote browser sessions for cloud-scale automation
- MCP server for agent framework integration
- Inline data transforms within queries
- Infinite scroll, pagination, and popup handling support

## Integrations
Playwright, LangChain, Zapier, MCP (Model Context Protocol), Google Colab, Headless browsers

## Platforms
Windows, Web, API, Browser extension, Developer SDK, CLI

## Pricing
Open Source, Free tier available

## Links
- Website: https://www.agentql.com
- Documentation: https://docs.agentql.com
- Repository: https://github.com/tinyfish-io/agentql
- EveryDev.ai: https://www.everydev.ai/tools/agentql
