lakeFS
lakeFS is a data version control platform that brings Git-like branching, merging, and rollback capabilities to data lakes, enabling AI and data teams to manage data lifecycle, provenance, and access at scale.
At a Glance
Pricing
Free forever open-source version of lakeFS with core data version control features.
Listed Mar 2026
About lakeFS
lakeFS is a scalable data version control system built by Treeverse that applies proven software engineering practices to data lake management. It lets teams branch, commit, merge, and roll back data just like code, providing isolated environments for testing, reproducible ML experiments, and atomic data promotion. Trusted by organizations like Netflix, Volvo, Lockheed Martin, and Amazon, lakeFS integrates with widely used data and AI tools without moving data out of your storage. It is available as an open-source project and as a managed Enterprise offering with advanced security and governance features.
- Data Branching & Merging: Create zero-copy branches of your data lake for isolated testing and experimentation, then atomically merge changes back to production.
- Format-Agnostic Version Control: Works with any data format, including Parquet, CSV, Avro, JSON, Delta Lake, Iceberg, and Hudi, as well as unstructured data such as images and video.
- Data CI/CD with Hooks: Enforce data quality and compliance standards automatically using lakeFS hooks before changes reach production.
- Instant Rollback: Recover from data incidents immediately by reverting to any previous commit without duplicating data.
- Audit Trail & Lineage: Gain full visibility into data history with built-in audit logs to satisfy model governance and compliance requirements.
- Role-Based Access Control (RBAC): The Enterprise plan adds RBAC, SSO, SCIM, and IAM Roles for fine-grained, secure access management across teams.
- lakeFS Mount: Virtually mount remote lakeFS repositories as a local filesystem for high-performance deep learning workloads.
- Transactional Mirroring: Replicate repositories to remote regions for disaster recovery and data locality while keeping the replicas consistent.
- Broad Integrations: Connects natively with Spark, Databricks, Airflow, Kafka, Flink, Airbyte, dbt, MLflow, Kubeflow, AWS SageMaker, and many more tools.
- Cloud & Storage Agnostic: Supports AWS S3, Azure Blob, Google Cloud Storage, MinIO, Ceph, Dell EMC, and on-premises storage via the S3 interface.
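The branching, merging, and rollback behavior described above can be pictured with a small toy model. This is only an illustrative sketch, not the lakeFS implementation or its API: `ToyRepo`, `Commit`, and every method name are hypothetical, and it assumes a commit is just a mapping from logical paths to object addresses. The point it shows is that a branch is a pointer to a commit, so creating a branch or rolling back moves a pointer and copies no data.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class Commit:
    """Snapshot mapping logical paths to physical object addresses (toy model)."""
    objects: Dict[str, str]
    message: str
    parent: Optional["Commit"] = None


class ToyRepo:
    """Hypothetical in-memory stand-in for a versioned repository."""

    def __init__(self) -> None:
        self.branches: Dict[str, Commit] = {"main": Commit({}, "initial")}

    def create_branch(self, name: str, source: str) -> None:
        # Zero-copy: the new branch points at the same commit as its source.
        self.branches[name] = self.branches[source]

    def commit(self, branch: str, changes: Dict[str, str], message: str) -> Commit:
        head = self.branches[branch]
        new_head = Commit({**head.objects, **changes}, message, parent=head)
        self.branches[branch] = new_head
        return new_head

    def merge(self, source: str, dest: str) -> Commit:
        # Simplification: the source wins on every path; no deletes, no conflicts.
        src, dst = self.branches[source], self.branches[dest]
        merged = Commit({**dst.objects, **src.objects},
                        f"merge {source} into {dest}", parent=dst)
        self.branches[dest] = merged  # single pointer swap, so readers see all or nothing
        return merged

    def rollback(self, branch: str, commit: Commit) -> None:
        # Rolling back only moves the branch pointer to an earlier commit.
        self.branches[branch] = commit


repo = ToyRepo()
good = repo.commit("main", {"sales/2024.parquet": "obj-001"}, "load sales")
repo.create_branch("experiment", source="main")
repo.commit("experiment", {"sales/2024.parquet": "obj-002"}, "rewrite sales")
print(repo.branches["main"].objects)   # main is unaffected by the experiment
repo.merge("experiment", "main")
print(repo.branches["main"].objects)   # main now reflects the merged change
repo.rollback("main", good)
print(repo.branches["main"].objects)   # main points at the earlier commit again
```

A real system also has to handle deletions, merge conflicts, and concurrent writers, which this sketch deliberately leaves out.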
To get started, run lakeFS locally using the quickstart guide at docs.lakefs.io, or sign up for lakeFS Cloud. Connect your existing object storage, create a repository, and begin branching your data as you would code in Git.
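When data is read or written through the S3-compatible interface, the repository takes the place of the bucket and the branch (or other ref) becomes the first component of the object key. The helpers below are a minimal sketch of that addressing convention; `to_s3_uri`, `parse_s3_uri`, and `LakeFSPath` are hypothetical names introduced here, and the sketch assumes the ref itself contains no `/`.

```python
from typing import NamedTuple


class LakeFSPath(NamedTuple):
    repository: str
    ref: str   # branch name, tag, or commit ID
    path: str  # object path inside the repository


def to_s3_uri(repository: str, ref: str, path: str) -> str:
    """Build an S3-style URI of the form s3://<repository>/<ref>/<path>."""
    return f"s3://{repository}/{ref}/{path.lstrip('/')}"


def parse_s3_uri(uri: str) -> LakeFSPath:
    """Split an S3-style URI back into repository, ref, and path."""
    prefix = "s3://"
    if not uri.startswith(prefix):
        raise ValueError(f"not an s3:// URI: {uri!r}")
    parts = uri[len(prefix):].split("/", 2)
    if len(parts) != 3 or not all(parts):
        raise ValueError(f"expected s3://<repository>/<ref>/<path>, got {uri!r}")
    return LakeFSPath(*parts)


# Hypothetical repository and branch names, for illustration only.
main_uri = to_s3_uri("example-repo", "main", "data/events.parquet")
exp_uri = to_s3_uri("example-repo", "experiment", "data/events.parquet")
print(main_uri)
print(exp_uri)
print(parse_s3_uri(exp_uri))
```

Under this convention, pointing an existing job at a different branch is a matter of changing one path component rather than copying data.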
Pricing
Free Plan Available
Free forever open-source version of lakeFS with core data version control features.
- Format-Agnostic Data Version Control
- Cloud-Agnostic
- Zero-copy clones for isolated environments (via branches)
- Atomic Data Promotion (via merges)
- Data Stays in One Place
Enterprise
Full-featured enterprise plan with unlimited seats, advanced security, governance, and SLA support.
- All Open Source features
- Role-Based Access Control (RBAC)
- Single Sign-On (SSO)
- SCIM Support
- IAM Roles
- Mount Capability
- Audit Logs
- Transactional Mirroring
- Iceberg REST Catalog
- Metadata Search
- Multiple Storage Backends Support
- Simplified Garbage Collection (Managed or Standalone)
- SOC2
- Support SLA
- Unlimited seats
Capabilities
Key Features
- Data branching and merging (zero-copy)
- Atomic data promotion via merges
- Data CI/CD using lakeFS Hooks
- Instant rollback from data incidents
- Built-in audit trail and data lineage
- Role-Based Access Control (RBAC)
- Single Sign-On (SSO)
- SCIM Support
- IAM Roles authentication
- lakeFS Mount for local filesystem access
- Transactional Mirroring (cross-region)
- Configurable Garbage Collection
- Metadata Search
- Iceberg REST Catalog
- Multiple Storage Backends Support
- Format-agnostic version control
- Cloud-agnostic deployment
- Private-link support
- SOC2 compliance
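As an illustration of the data CI/CD feature listed above, the sketch below shows the kind of check a hook could run before a merge is allowed. It is not the lakeFS hooks API: `pre_merge_violations`, its parameters, and the example paths are all hypothetical, and the rule itself (only Parquet files under a protected prefix) is an assumption chosen for the example.

```python
from typing import Iterable, List, Tuple


def pre_merge_violations(
    changed_paths: Iterable[str],
    protected_prefix: str = "tables/",
    allowed_suffixes: Tuple[str, ...] = (".parquet",),
) -> List[str]:
    """Return one message per changed path that breaks the format rule.

    An empty list means the (hypothetical) merge may proceed.
    """
    violations = []
    for path in changed_paths:
        # Only paths under the protected prefix are subject to the rule.
        if path.startswith(protected_prefix) and not path.endswith(allowed_suffixes):
            violations.append(f"{path}: expected one of {allowed_suffixes}")
    return violations


# Hypothetical set of paths changed on a branch about to be merged.
changes = [
    "tables/orders/part-0.parquet",
    "tables/orders/notes.csv",
    "scratch/tmp.json",
]
problems = pre_merge_violations(changes)
if problems:
    print("merge blocked:", problems)
else:
    print("merge allowed")
```

In practice the list of changed paths would come from the diff between the source branch and its merge target, and a failing check would cause the merge to be rejected.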