Delta Lake (Linux Foundation Projects)
Delta Lake is an open-source storage framework for building a format-agnostic Lakehouse architecture. It adds ACID transactions, scalable metadata handling, and time travel to data stored in a lake, unifying ETL, data warehousing, and machine learning workloads.
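To make the idea of a versioned, transactional table concrete, here is a minimal sketch of an append-only commit log using only the Python standard library. It is a toy model, not Delta Lake's implementation or API: the class `ToyLogTable`, the `_toy_log` directory, and the method names are hypothetical and exist only for illustration. The one detail borrowed from Delta Lake is the general shape of the design, in which each commit is a numbered file of "add" and "remove" actions and a table version is reconstructed by replaying commits in order.

```python
import json
import os
import tempfile


class ToyLogTable:
    """Toy table: each commit is a numbered JSON file of add/remove actions."""

    def __init__(self, root):
        # Hypothetical log directory name, chosen for this sketch only.
        self.log_dir = os.path.join(root, "_toy_log")
        os.makedirs(self.log_dir, exist_ok=True)

    def _commit_path(self, version):
        # Zero-padded names keep commit files in version order when sorted.
        return os.path.join(self.log_dir, f"{version:020d}.json")

    def latest_version(self):
        versions = [int(name.split(".")[0]) for name in os.listdir(self.log_dir)]
        return max(versions, default=-1)

    def commit(self, add=(), remove=()):
        version = self.latest_version() + 1
        actions = [{"add": f} for f in add] + [{"remove": f} for f in remove]
        # Mode "x" raises FileExistsError if the file already exists,
        # so two writers cannot both claim the same version number.
        with open(self._commit_path(version), "x") as fh:
            json.dump(actions, fh)
        return version

    def files_as_of(self, version=None):
        # Replay commits 0..version to rebuild the set of live data files.
        if version is None:
            version = self.latest_version()
        live = set()
        for v in range(version + 1):
            with open(self._commit_path(v)) as fh:
                for action in json.load(fh):
                    if "add" in action:
                        live.add(action["add"])
                    else:
                        live.discard(action["remove"])
        return sorted(live)


root = tempfile.mkdtemp()
table = ToyLogTable(root)
v0 = table.commit(add=["part-0.parquet", "part-1.parquet"])
v1 = table.commit(add=["part-2.parquet"], remove=["part-0.parquet"])
current_files = table.files_as_of()    # state after the latest commit
files_at_v0 = table.files_as_of(v0)    # "time travel" to the first commit
```

Because old commits are never rewritten, reading an earlier version is just a matter of stopping the replay sooner, which is the intuition behind time travel in a log-based table format.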
Founding Story
Delta Lake was originally developed at Databricks, starting in 2016, to solve data reliability issues in Apache Spark. It was open-sourced in 2019 and donated to the Linux Foundation to establish an open standard for the data lakehouse.
Leadership
Founders
Michael Armbrust
Distinguished Software Engineer at Databricks. Previously at Google and Microsoft. Creator and lead architect of Delta Lake.
Ali Ghodsi
Co-founder and CEO of Databricks. Professor at UC Berkeley. Co-creator of Apache Spark.
Matei Zaharia
Co-founder and CTO of Databricks. Associate Professor at Stanford. Original creator of Apache Spark.
Reynold Xin
Co-founder and Chief Architect at Databricks. Leading the development of Apache Spark and Delta Lake.
Executive Team
Michael Armbrust
TSC Chair / Lead Maintainer
Distinguished Engineer at Databricks, lead architect of Delta Lake.
Dominique Brezinski
TSC Member
Distinguished Engineer at Apple, focused on security and data engineering.
Business Model
Revenue Model
Open source (Apache License 2.0). The project itself is free to use and generates no revenue directly. Commercial vendors such as Databricks earn revenue by offering managed services built on it.
Pricing Tiers
- Open source (free): Full access to the storage framework, connectors, and UniForm under Apache 2.0.
- Managed service (commercial): Managed Delta Lake with advanced optimization (Liquid Clustering, Predictive Optimization) on the Databricks platform.
Target Markets
- Enterprise Data Engineering
- Data Science
- Cloud Data Platforms
- FinTech
- HealthTech
Use Cases
- Lakehouse storage
- Reliable ETL pipelines
- Unified analytics & ML
- Data warehousing on object storage
- Cross-cloud data sharing
Notable Users
- Starbucks
- Shell
- Walgreens
- Comcast
History & Milestones
Announcement of Catalog-Managed Tables, shifting transaction coordination to Unity Catalog.
Delta Lake 4.0 is released with enhanced catalog integration and smarter change tracking.
Delta Lake 3.0 is released, introducing UniForm (Universal Format) for Iceberg and Hudi compatibility.
Delta Lake 2.0 is announced, making all proprietary Databricks Delta APIs open source.
Delta Lake is open-sourced by Databricks at the Spark + AI Summit.
