# DuckDB

> DuckDB is a high-performance, open-source, in-process analytical SQL database that runs in many environments, from the CLI and Python to WebAssembly and client-server mode.

DuckDB is an open-source, high-performance analytical SQL database system developed by the DuckDB Foundation and maintained by DuckDB Labs. It is designed to run in-process alongside application code, which removes the need for a separate database server. The project is released under the MIT license and is available on GitHub with over 38,000 stars as of mid-2026.

## What It Is

DuckDB is an OLAP (Online Analytical Processing) database engine built for fast analytical queries on local and remote data. Unlike traditional client-server databases, DuckDB runs embedded inside the host process, much like SQLite, but it is optimized for analytical rather than transactional workloads. Its columnar storage engine can spill to disk, which allows queries over datasets larger than available RAM. DuckDB supports a rich SQL dialect including window functions, nested correlated subqueries, complex types (arrays, structs, maps), and collations.

## Ecosystem and Integrations

DuckDB ships native client APIs for many languages and integrates with a wide range of file formats, storage services, databases, and data science tools:

- **Languages and APIs**: Python, R, Java (JDBC), Go, Rust, Node.js, C, C++, ODBC
- **Formats**: Parquet, CSV, JSON, Apache Iceberg, Delta Lake, Arrow, Avro, Lance, Vortex, DuckLake
- **Cloud storage**: AWS S3, Azure Blob Storage, Google Cloud Storage, Cloudflare R2, Hugging Face datasets
- **Databases**: PostgreSQL, MySQL, SQLite, MotherDuck
- **Data science tools**: Pandas, dplyr (via duckplyr), Jupyter, Marimo

The extension mechanism allows adding new capabilities. Many core features, such as spatial queries, Iceberg support, and cloud storage access, are themselves implemented as extensions.

## Deployment Model

DuckDB is primarily an in-process database: it runs inside the calling application with no separate server process required. Installation takes seconds via package managers (`pip install duckdb`, `npm install @duckdb/node-api`, `cargo add duckdb`, or a shell installer). It runs on Linux, macOS, Windows, and WebAssembly (in-browser). A persistent database is stored as a single file, or DuckDB can operate entirely in memory.

## Update: Quack Remote Protocol and v1.5.3

The latest stable release is **v1.5.3** (published May 20, 2026), described as a bugfix release. Alongside this, DuckDB introduced the **Quack remote protocol** (currently in beta), which turns DuckDB into a client-server database accessible via `ATTACH 'quack:...' AS db` syntax. This marks a significant architectural expansion beyond the traditional in-process model. Recent blog posts cover new DuckDB-Iceberg features in v1.5.3 and support for the Lance lakehouse format, signaling active development on open table format integrations.

## Why It Matters for Data Teams

DuckDB fills a gap between heavyweight distributed query engines (like Spark or BigQuery) and lightweight transactional databases (like SQLite). It enables data engineers and analysts to run fast analytical SQL directly on files, dataframes, or cloud storage without spinning up infrastructure. The DuckDB homepage describes it as "the friendliest analytical database, loved by data teams worldwide" (a vendor claim), and the project's MIT license means it can be embedded freely in commercial products. The DuckDB Foundation hosts community events including DuckCon, with DuckCon #7 scheduled for Amsterdam in June 2026.

## Features
- In-process analytical SQL database
- Columnar storage engine with disk spill support
- Rich SQL dialect with window functions, nested subqueries, complex types
- Native clients for Python, R, Java, Go, Rust, Node.js, C, C++, ODBC
- Direct querying of Parquet, CSV, JSON, and remote files
- Apache Iceberg, Delta Lake, Arrow, Avro, Lance, DuckLake format support
- Cloud storage integration (S3, Azure, GCS, Cloudflare R2)
- WebAssembly (Wasm) support for in-browser execution
- Extension mechanism for adding new features
- Spatial extension for geospatial queries
- Quack remote protocol for client-server mode (beta)
- Pandas and dplyr integration
- In-memory and persistent (single-file) database modes
- Custom user-defined functions (UDFs)

## Integrations
Python (pandas, Jupyter, Marimo), R (dplyr, duckplyr), Java (JDBC), Node.js, Go, Rust, PostgreSQL, MySQL, SQLite, MotherDuck, AWS S3, Microsoft Azure Blob Storage, Google Cloud Storage, Cloudflare R2, Hugging Face, Apache Iceberg, Delta Lake, Apache Arrow, Apache Avro, Lance, DuckLake, Vortex, ODBC, Claude AI

## Platforms
Windows, macOS, Linux, CLI, API, Developer SDK, Web

## Pricing
Open Source

## Version
v1.5.3

## Links
- Website: https://duckdb.org
- Documentation: https://duckdb.org/docs/current/
- Repository: https://github.com/duckdb/duckdb
- EveryDev.ai: https://www.everydev.ai/tools/duckdb
