Delta Lake (Linux Foundation Projects)
Delta Lake is an open-source storage framework that enables building a format-agnostic Lakehouse architecture, unifying ETL, data warehousing, and machine learning workloads.
At a Glance
- Enterprise Data Engineering
- Data Science
- Cloud Data Platforms
- FinTech
AI Tools by Delta Lake (Linux Foundation Projects)
Delta Lake: Open Source Lakehouse Storage Framework
Products & Services
Delta Lake: Core open-source storage layer that brings ACID transactions and reliability to data lakes.
Delta Kernel: A library for building Delta connectors for various engines without reimplementing the Delta protocol logic.
Delta UniForm: A feature that lets Delta tables be read as Apache Iceberg or Apache Hudi tables without duplicating data.
Connectors: Official connectors and language bindings for Apache Spark, Apache Flink, Presto, Trino, Rust, Python, and more.
Market Position
Delta Lake positions itself as a high-performance alternative to Apache Iceberg and Apache Hudi, emphasizing tight Apache Spark integration and UniForm for vendor-neutral interoperability.
Leadership
Founders
Michael Armbrust
Distinguished Software Engineer at Databricks. Previously at Google and Microsoft. Creator and lead architect of Delta Lake.
Ali Ghodsi
Co-founder and CEO of Databricks. Professor at UC Berkeley. Co-creator of Apache Spark.
Matei Zaharia
Co-founder and CTO of Databricks. Associate Professor at Stanford. Original creator of Apache Spark.
Reynold Xin
Co-founder and Chief Architect at Databricks. Leading the development of Apache Spark and Delta Lake.
Executive Team
Michael Armbrust
TSC Chair / Lead Maintainer
Distinguished Engineer at Databricks, lead architect of Delta Lake.
Dominique Brezinski
TSC Member
Distinguished Engineer at Apple, focused on security and data engineering.
Founding Story
Delta Lake was originally developed at Databricks, starting in 2016, to solve data reliability problems in Apache Spark workloads on data lakes. It was open-sourced in 2019 and donated to the Linux Foundation to establish an open standard for the data lakehouse.
Business Model
Revenue Model
Delta Lake is open source under the Apache License 2.0 and free to use. Revenue is generated by commercial vendors such as Databricks through managed services built on it.
Pricing Tiers
Open Source: Full access to the storage framework, connectors, and UniForm under the Apache License 2.0.
Databricks Managed: Managed Delta Lake with advanced optimization (Liquid Clustering, Predictive Optimization) on the Databricks platform.
Target Markets
- Enterprise Data Engineering
- Data Science
- Cloud Data Platforms
- FinTech
- HealthTech
Use Cases
- Lakehouse storage
- Reliable ETL pipelines
- Unified analytics & ML
- Data warehousing on object storage
- Cross-cloud data sharing
Notable Users
- Starbucks
- Shell
- Walgreens
- Comcast