Airbyte
Open-source data movement platform for ELT pipelines and AI agents, connecting 600+ sources to warehouses, lakes, and AI applications via MCP, SDK, and CLI.
At a Glance
Free self-hosted or cloud plan with community support and basic connector access.
Listed Jun 2026
About Airbyte
Airbyte is an open-source data movement platform that has been moving production data for thousands of companies since 2020. It provides a catalog of 600+ connectors for APIs, databases, data warehouses, data lakes, and AI applications, and is available both as a self-hosted deployment and as a managed cloud service. The project is hosted on GitHub under a dual MIT/ELv2 license and has accumulated over 21,000 stars.
What It Is
Airbyte covers two primary use cases: ELT/ETL data pipelines that move data into warehouses and lakes, and a newer "data and action layer" for AI agents that gives LLMs and agent frameworks real-time read/write access to business data. The core open-source platform handles data replication, while the Airbyte Agents product (including a Context Store, MCP server, and Python SDK) extends that infrastructure to serve agentic workflows. Both paths share the same underlying connector infrastructure.
Architecture: Two Products, One Foundation
Airbyte's platform is organized around two distinct but related products:
- Data Replication (ELT): The original open-source core. Supports 600+ connectors, Change Data Capture (CDC), schema propagation, column selection, cron scheduling, and a no-code Connector Builder. Can be self-hosted or used via Airbyte Cloud.
- Airbyte Agents: A managed context layer for AI agents. Includes a Context Store (a live, searchable index of connected business data), an MCP server for Claude/ChatGPT/Cursor, a Python Agent SDK, and an Automation Builder UI. The Agent SDK supports pydantic-ai, LangChain, OpenAI Agents, CrewAI, LlamaIndex, AutoGen, and FastMCP.
The homepage states that the same replication infrastructure powering data pipelines now powers every agent built on the platform.
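Two of the replication features listed above, Change Data Capture and column selection, can be illustrated with a toy sketch. This is not Airbyte's implementation: the event shape (`op`, `key`, `row`) and the `apply_change` helper are hypothetical names chosen for illustration. It only shows the general idea of replaying insert/update/delete events against a destination while keeping only the selected columns.

```python
def apply_change(replica, event, selected_columns):
    """Apply one change event to a replica dict keyed by primary key.

    Hypothetical event shape: {"op": ..., "key": ..., "row": {...}}.
    """
    key = event["key"]
    if event["op"] == "delete":
        # Deletes remove the row from the destination copy.
        replica.pop(key, None)
    elif event["op"] in ("insert", "update"):
        # Column selection: drop any column that was not chosen for replication.
        row = {c: v for c, v in event["row"].items() if c in selected_columns}
        replica.setdefault(key, {}).update(row)
    else:
        raise ValueError(f"unknown op: {event['op']}")
    return replica


# A made-up change stream; the "ssn" column is deliberately not selected.
events = [
    {"op": "insert", "key": 1, "row": {"email": "a@example.com", "plan": "free", "ssn": "x"}},
    {"op": "insert", "key": 2, "row": {"email": "b@example.com", "plan": "pro", "ssn": "y"}},
    {"op": "update", "key": 1, "row": {"plan": "pro"}},
    {"op": "delete", "key": 2},
]

replica = {}
for event in events:
    apply_change(replica, event, selected_columns={"email", "plan"})
```

After the loop, the replica holds only row 1 with its latest `plan` value, and the unselected column never reaches the destination.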
How the Agent Layer Works
Airbyte Agents introduces a "Connect, Ask, Act" model:
- Connect: Authenticate once with managed OAuth/token handling across 50+ agent connectors.
- Ask: Query the Context Store across all connected systems with a single call, returning cross-system context (e.g., a customer record unified from Salesforce, Zendesk, and Stripe).
- Act: Write back to systems of record — update CRM fields, create tickets, post messages — through the same SDK.
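The three steps above can be sketched with a small in-memory stand-in. This does not use the Airbyte Agent SDK, and none of the names here (`ToyContextStore`, `connect`, `ask`, `act`) are taken from it; they are hypothetical and exist only to show the shape of the pattern, assuming records are keyed by a shared customer ID.

```python
from dataclasses import dataclass, field


@dataclass
class ToyContextStore:
    """Hypothetical in-memory stand-in for a cross-system index (not the Airbyte SDK)."""

    records: dict = field(default_factory=dict)  # system name -> {customer_id: record}
    connected: set = field(default_factory=set)

    def connect(self, system, records):
        # Connect: register a system once. Real credential handling is out of scope.
        self.connected.add(system)
        self.records[system] = {cid: dict(rec) for cid, rec in records.items()}

    def ask(self, customer_id):
        # Ask: one call that gathers what every connected system knows about a customer.
        return {
            system: dict(self.records[system][customer_id])
            for system in sorted(self.connected)
            if customer_id in self.records[system]
        }

    def act(self, system, customer_id, **changes):
        # Act: write a change back to one system of record.
        if system not in self.connected:
            raise KeyError(f"{system} is not connected")
        self.records[system].setdefault(customer_id, {}).update(changes)


store = ToyContextStore()
store.connect("salesforce", {"c1": {"name": "Acme", "tier": "gold"}})
store.connect("zendesk", {"c1": {"open_tickets": 2}})
store.connect("stripe", {"c1": {"mrr_usd": 1200}})

context = store.ask("c1")                       # unified view across three systems
store.act("salesforce", "c1", tier="platinum")  # write back to one system
updated = store.ask("c1")
```

Here `ask` returns copies, so an earlier answer is not silently changed by a later `act`; a real context layer would make its own choices about freshness and consistency.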
The vendor publishes open-source benchmarks claiming the Airbyte MCP uses 80% fewer tokens on a single query, makes 40% fewer tool calls compared with native vendor MCPs, and achieves 90% cost savings on multi-source queries versus custom connectors.
Open-Source Lineage and License
The core repository (airbytehq/airbyte) was created in July 2020 and is licensed under a combination of MIT and Elastic License 2.0 (ELv2). The ELv2 license prohibits offering the software as a hosted or managed service to third parties, which means the open-source version is free to self-host but cannot be resold as a managed service. The Agent SDK (airbytehq/airbyte-agent-sdk) is a separate repository available via uv pip install airbyte-agent-sdk.
Update: Airbyte 2.0
The latest GitHub release is v2.0.0 (Airbyte 2.0), published on October 15, 2025. The repository remains actively maintained, with the last push recorded in June 2026. The homepage announces Airbyte Agents as a new product: "New: Airbyte Agents. Context-aware AI, built on your data." This signals an expansion into AI agent infrastructure alongside the established ELT pipeline use case.
Adoption and Scale Claims
According to figures published by the vendor on its homepage and pricing page, 20% of the Fortune 500 and 7,000 companies overall use Airbyte, 1.2 million pipelines are synced daily, the community has 27,000 members, and the company has raised $181 million from investors. The GitHub repository shows 21,499 stars and 5,231 forks as of the last update. The vendor figures have not been independently verified.
Pricing
Core
Free self-hosted or cloud plan with community support and basic connector access.
- 600+ connectors
- Connector Builder
- Airbyte API
- Terraform Provider
- PyAirbyte
Standard
Volume-based cloud plan with faster sync frequency and support portal access.
- 600+ connectors
- 15-min max sync frequency
- Cron scheduling
- Schema propagation
- Column selection
- Change Data Capture
- Airbyte Support Portal
- Multitenancy
- Multiple workspaces
Plus
Annual plan with Standard features, accelerated support, and bulk-credit discounts.
- Everything in Standard
- Accelerated Support
- Annual billing
- Bulk-credit discounts
Pro
Capacity-based plan (Data Workers) for production workloads with premium features and support.
- Everything in Standard and Plus
- <5 min max sync frequency
- Role-Based Access Control
- Row filtering
- Field hashing & encryption
- Multiple data regions
- AWS PrivateLink (add-on)
- Premium Support
- Priority Support (add-on)
- User groups & SCIM
- OpenTelemetry metrics
- External secret references
Enterprise Flex
New enterprise plan with full platform features and optional business-critical support.
- Everything in Pro
- Business Critical Support (add-on)
- Enterprise connectors (add-on)
Capabilities
Key Features
- 600+ pre-built connectors for APIs, databases, warehouses, and SaaS tools
- Change Data Capture (CDC) support
- Schema propagation and column selection
- No-code Connector Builder and low-code CDK
- Airbyte Agents Context Store for cross-system AI queries
- MCP server for Claude, ChatGPT, and Cursor
- Python Agent SDK (airbyte-agent-sdk)
- Automation Builder UI for no-code agent workflows
- Managed OAuth and token refresh for 50+ agent connectors
- Read and write (Act) capabilities through the Agent SDK
- Self-hosted and cloud deployment options
- Terraform provider for infrastructure-as-code and PyAirbyte for running connectors from Python
- Orchestration integrations with Airflow, Dagster, Prefect, and Kestra
- Role-Based Access Control (RBAC)
- Field hashing and encryption
- Row filtering
- Multiple data regions support
- SOC 2 Type II, GDPR, and HIPAA compliance
- Single Sign-On (SSO)
- OpenTelemetry metrics support
