Project Nessie is a transactional catalog for data lakes with Git-like semantics. It works with Apache Iceberg tables to provide multi-table transactions, branching, tagging, and time-travel queries across your data lake.
Python data engineers configure PySpark to use Project Nessie as the Iceberg catalog, which enables table branching within Spark jobs. In a typical workflow, an engineer creates a Nessie branch, runs a PySpark transformation that modifies multiple Iceberg tables on that branch, validates the results, and then merges the branch into main. The merge applies the multi-table update atomically, with full rollback capability.
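The branch, transform, validate, merge workflow can be sketched roughly as follows. This is a minimal illustration rather than a definitive setup: the helper names (`nessie_spark_conf`, `run_on_branch`), the catalog name `nessie`, and the branch name are hypothetical, and it assumes the Iceberg Spark runtime and the Nessie Spark SQL extensions are already on the Spark classpath. Exact configuration keys and SQL syntax can vary between Nessie and Iceberg versions, so check them against the documentation for your release.

```python
def nessie_spark_conf(uri, warehouse, ref="main", catalog="nessie"):
    """Build Spark settings for an Iceberg catalog backed by Nessie.

    `catalog` is an arbitrary name chosen for this sketch.
    """
    prefix = f"spark.sql.catalog.{catalog}"
    return {
        # Iceberg extensions plus the Nessie extensions that add branch SQL.
        "spark.sql.extensions": ",".join([
            "org.apache.iceberg.spark.extensions.IcebergSparkSessionExtensions",
            "org.projectnessie.spark.extensions.NessieSparkSessionExtensions",
        ]),
        prefix: "org.apache.iceberg.spark.SparkCatalog",
        f"{prefix}.catalog-impl": "org.apache.iceberg.nessie.NessieCatalog",
        f"{prefix}.uri": uri,              # Nessie REST endpoint (assumed)
        f"{prefix}.ref": ref,              # branch the session starts on
        f"{prefix}.warehouse": warehouse,  # storage location for table data
    }


def run_on_branch(spark, branch, transform_sql, validate,
                  catalog="nessie", target="main"):
    """Run statements on a Nessie branch and merge only if validation passes.

    `spark` is any object with a `sql(str)` method (a SparkSession in practice).
    `validate` is a caller-supplied callable that receives `spark` and
    returns True when the branch contents look correct.
    """
    spark.sql(f"CREATE BRANCH IF NOT EXISTS {branch} IN {catalog} FROM {target}")
    spark.sql(f"USE REFERENCE {branch} IN {catalog}")
    for stmt in transform_sql:
        spark.sql(stmt)  # changes land on the branch, not on the target
    if not validate(spark):
        return False     # target is left untouched
    spark.sql(f"MERGE BRANCH {branch} INTO {target} IN {catalog}")
    return True
```

In practice, each entry of the dictionary returned by `nessie_spark_conf` would be passed to `SparkSession.builder.config(key, value)` before calling `getOrCreate()`, and the resulting session handed to `run_on_branch`. Keeping validation as a callable means a failed check returns before the merge, so the target branch never sees partial results.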
Project Nessie is free to use.
Project Nessie is listed under the Data Lake Management category on Python Data Engineering.
Related
| Tool | Description | Pricing | Rating |
|---|---|---|---|
| Apache Gravitino (new) | Unified Metadata Management | Free | ★ 4.0 |
| Apache HBase | Distributed Column-Family Store | Free | ★ 4.2 |
| Titan | Scalable Graph Database | Free | ★ 3.6 |