Data Governance & Metadata
Open Data Management System
★ 4.1
Metadata Service for Data Lineage
★ 4.3
pip install ckanapi
pip install marquez-client

Python data engineers use the `ckanapi` library to harvest open datasets from CKAN portals programmatically: listing available datasets, downloading CSV or JSON resources, and ingesting them into internal pipelines. Government open data platforms such as data.gov and data.gov.uk run on CKAN, which makes it a common entry point for public data ingestion workflows.
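The harvesting flow described above can be sketched roughly as follows. This is a minimal illustration, not a reference implementation: `open_portal`, `wanted_resources`, and `harvest` are hypothetical helper names, and the sketch assumes the portal exposes the standard `package_list` and `package_show` actions and that each resource carries `format` and `url` fields.

```python
from __future__ import annotations

from typing import Any, Iterator

# Formats worth ingesting; CKAN stores "format" as free text, so it is normalised below.
WANTED_FORMATS = {"CSV", "JSON"}


def open_portal(url: str) -> Any:
    """Return a ckanapi client for the given portal URL."""
    # Imported lazily so the helpers below can be exercised without ckanapi installed.
    from ckanapi import RemoteCKAN

    return RemoteCKAN(url)


def wanted_resources(package: dict, formats: set[str] = WANTED_FORMATS) -> list[dict]:
    """Keep only resources that have a URL and whose format is in `formats`."""
    return [
        res
        for res in package.get("resources", [])
        if res.get("url") and (res.get("format") or "").strip().upper() in formats
    ]


def harvest(ckan: Any, limit: int = 5) -> Iterator[tuple[str, str, str]]:
    """Yield (dataset name, format, download URL) for the first `limit` datasets."""
    for name in ckan.action.package_list(limit=limit):
        package = ckan.action.package_show(id=name)
        for res in wanted_resources(package):
            yield name, res["format"].strip().upper(), res["url"]
```

A caller would pass `open_portal("https://demo.ckan.org")` (or another portal URL) to `harvest` and hand each yielded URL to whatever download or ingestion step the pipeline already uses. Accepting the client as an argument keeps the filtering logic easy to test against a stub.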
Python data engineers integrate Marquez with Airflow using the `openlineage-airflow` package, which automatically emits lineage events for each task and captures which datasets the task reads and writes, without changes to the DAG code. Engineers then query the Marquez API to build impact analysis tools that identify the downstream jobs affected by an upstream schema change.
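An impact analysis of the kind described above might look like the sketch below. It calls the Marquez REST API directly with the standard library instead of going through `marquez-client`, and it assumes the lineage endpoint is `GET /api/v1/lineage?nodeId=...&depth=...` and returns a `graph` list whose nodes carry an `id` and `outEdges` entries with a `destination`. The function names and the example node ids are hypothetical.

```python
import json
from collections import deque
from urllib.parse import urlencode
from urllib.request import urlopen


def fetch_lineage(base_url: str, node_id: str, depth: int = 20) -> dict:
    """Fetch the lineage graph around `node_id` (assumed endpoint and response shape)."""
    query = urlencode({"nodeId": node_id, "depth": depth})
    with urlopen(f"{base_url}/api/v1/lineage?{query}") as resp:
        return json.load(resp)


def downstream_jobs(lineage: dict, start_id: str) -> list[str]:
    """Breadth-first walk along outgoing edges, collecting job nodes in visit order."""
    # Map each node id to the ids it points at.
    out_edges = {
        node["id"]: [edge["destination"] for edge in node.get("outEdges", [])]
        for node in lineage.get("graph", [])
    }
    seen = {start_id}
    queue = deque([start_id])
    jobs = []
    while queue:
        for nxt in out_edges.get(queue.popleft(), []):
            if nxt in seen:
                continue
            seen.add(nxt)
            queue.append(nxt)
            # Node ids are assumed to follow the "job:<namespace>:<name>" convention.
            if nxt.startswith("job:"):
                jobs.append(nxt)
    return jobs
```

For a dataset whose schema is about to change, a caller would pass something like `fetch_lineage("http://localhost:5000", "dataset:my_namespace:my_table")` to `downstream_jobs` and notify the owners of the returned jobs. Only outgoing edges are followed, so upstream producers of the dataset are not reported.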
Amundsen vs Apache Atlas
Apache Atlas vs CKAN
Apache Atlas vs Marquez
Apache Atlas vs DataHub
Apache Atlas vs Collibra
Apache Atlas vs Apache Gravitino
Individual Tool Pages