Tools for managing, versioning, and governing data lakes.
Data lake management tools provide version control, cataloging, and governance capabilities for data stored in data lakes. As organizations accumulate large volumes of raw and processed data in object storage, managing that data becomes increasingly complex. These tools bring Git-like versioning, transactional guarantees, and metadata management to data lakes, so data engineers can safely experiment with data transformations, roll back changes, and maintain data quality at scale.
Git-Like Data Lake Versioning
An open-source platform that delivers resilience and manageability to object-storage-based data lakes. lakeFS provides Git-like branching, merging, and versioning for data, enabling safe experimentation and CI/CD workflows for data pipelines.
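As a rough illustration of what Git-like branching and merging over an object store means, here is a minimal in-memory sketch. The `ToyRepo` class, its methods, and the object paths are hypothetical names invented for this example. They are not the lakeFS API, and a real system also tracks commit history, deletions, and merge conflicts.

```python
# Toy model of Git-like versioning over an object store.
# Illustrative only: ToyRepo is a hypothetical class, NOT the lakeFS API.

class ToyRepo:
    def __init__(self):
        # Each branch maps object paths to content; "main" starts empty.
        self.branches = {"main": {}}

    def create_branch(self, name, source="main"):
        # A new branch starts as a snapshot (shallow copy) of its source.
        self.branches[name] = dict(self.branches[source])

    def put(self, branch, path, content):
        # Writes are isolated to the branch they target.
        self.branches[branch][path] = content

    def merge(self, source, target):
        # Naive merge: objects on the source overwrite those on the target.
        # This sketch ignores deletions and conflict detection.
        self.branches[target].update(self.branches[source])


repo = ToyRepo()
repo.put("main", "events/2024.parquet", "v1")

# Experiment on a branch without touching main.
repo.create_branch("experiment")
repo.put("experiment", "events/2024.parquet", "v2")
main_before_merge = repo.branches["main"]["events/2024.parquet"]

# Promote the experiment once it looks good.
repo.merge("experiment", "main")
main_after_merge = repo.branches["main"]["events/2024.parquet"]
```

In this sketch, main keeps serving "v1" until the merge, which is the property that makes branch-based experimentation safe.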
Transactional Data Lake Catalog
A transactional catalog for data lakes with Git-like semantics. Nessie works with Apache Iceberg tables to provide multi-table transactions, branching, tagging, and time-travel queries across your data lake.
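The idea of multi-table transactions, tagging, and time travel in a catalog can be sketched with a small in-memory model. The `ToyCatalog` class, the table names, and the snapshot identifiers below are hypothetical and chosen only for illustration. This is not Nessie's API or storage model.

```python
# Toy catalog with multi-table commits, tags, and time travel.
# Illustrative only: ToyCatalog is a hypothetical class, NOT Nessie's API.

class ToyCatalog:
    def __init__(self):
        # History is a list of catalog states: table name -> snapshot id.
        self.history = [{}]
        self.tags = {}

    def commit(self, updates):
        # All table updates land together in one new catalog version.
        new_state = dict(self.history[-1])
        new_state.update(updates)
        self.history.append(new_state)
        return len(self.history) - 1  # version number of the new commit

    def tag(self, name, version):
        # A tag is a named, fixed pointer to one catalog version.
        self.tags[name] = version

    def read(self, table, version=None):
        # version=None reads the latest state; an int or a tag name
        # reads the catalog as it was at that version (time travel).
        if version is None:
            version = len(self.history) - 1
        elif isinstance(version, str):
            version = self.tags[version]
        return self.history[version].get(table)


catalog = ToyCatalog()
v1 = catalog.commit({"orders": "snap-1", "customers": "snap-1"})
catalog.tag("release-1", v1)
v2 = catalog.commit({"orders": "snap-2", "customers": "snap-2"})

latest_orders = catalog.read("orders")
tagged_orders = catalog.read("orders", "release-1")
```

Because both tables change in a single commit, a reader at any version sees a consistent pair of snapshots rather than a mix of old and new.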
Data Lake Bronze Layer Gateway
A gateway to a data lake's bronze layer that handles raw data ingestion and landing. FlightPath provides a managed entry point for data flowing into your data lake, ensuring consistent formatting and quality at the ingestion stage.