Open-source metadata platform for the modern data stack. Provides powerful and flexible metadata search, discovery, and lineage capabilities. Features real-time metadata updates, data quality monitoring, and governance workflows. Extensive Python SDK for automation and integration.
Python data engineers use DataHub's Python SDK and ingestion framework to crawl metadata from databases, dbt projects, and Airflow. They write YAML recipe files that the `datahub` CLI runs, typically on a schedule. Custom Python emitters push metadata about internal pipeline assets that built-in connectors don't cover.
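To make the recipe and emitter workflow above concrete, here is a minimal standard-library sketch of the two pieces of data involved: the source/sink structure of an ingestion recipe and the URN that identifies a dataset. It does not call the DataHub SDK. The helper names `dataset_urn` and `build_recipe` are hypothetical, and the host, database, and server values are placeholder assumptions. In a real project the recipe would normally live in a YAML file, and the SDK's own URN builder and REST emitter would replace these stand-ins.

```python
import json


def dataset_urn(platform: str, name: str, env: str = "PROD") -> str:
    """Build a DataHub-style dataset URN (hypothetical stand-in for the SDK helper)."""
    return f"urn:li:dataset:(urn:li:dataPlatform:{platform},{name},{env})"


def build_recipe(source_type: str, source_config: dict, server: str) -> dict:
    """Return the source/sink structure of an ingestion recipe as a plain dict."""
    return {
        "source": {"type": source_type, "config": source_config},
        "sink": {"type": "datahub-rest", "config": {"server": server}},
    }


# Placeholder values, assumed for illustration only.
recipe = build_recipe(
    "postgres",
    {"host_port": "localhost:5432", "database": "analytics"},
    "http://localhost:8080",
)
urn = dataset_urn("postgres", "analytics.public.orders")

# Print the recipe structure for inspection; a real recipe is usually written as YAML.
print(json.dumps(recipe, indent=2))
print(urn)
```

The same dictionary shape, written as YAML, is what a recipe file passed to `datahub ingest -c <recipe file>` would contain. The URN is the identifier a custom emitter would attach metadata to.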
DataHub is open source and free to use.
DataHub is listed under the Data Governance & Metadata category on Python Data Engineering.
Related
| Tool | Description | Pricing | Rating |
|---|---|---|---|
| Amundsen (featured) | Data Discovery & Metadata Engine | Free | ★ 4.5 |
| CKAN | Open Data Management System | Free | ★ 4.1 |
| Marquez | Metadata Service for Data Lineage | Free | ★ 4.3 |