Data Governance & Metadata
Enterprise Data Governance
★ 4.2
Metadata Service for Data Lineage
★ 4.3
pip install apache-atlas
pip install marquez-client
Python data engineers integrate with Apache Atlas via its REST API to register custom data assets, query lineage graphs, and enforce data classification policies. Post-ingestion scripts tag newly created tables with PII labels, and lineage queries trace how specific columns flow from source systems through transformations to the final warehouse tables.
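The tagging and lineage steps described above can be sketched against the Atlas v2 REST endpoints using only the standard library. This is a minimal sketch, not a definitive integration: the base URL, the entity GUID, and the helper names `classification_request` and `lineage_request` are illustrative assumptions, the `PII` classification type is assumed to already be defined in Atlas, and authentication is left out.

```python
import json
import urllib.parse
import urllib.request

# Assumed host and port; adjust for your deployment.
ATLAS_URL = "http://localhost:21000"


def classification_request(base_url, guid, type_names):
    """Build a POST that attaches classifications (e.g. PII) to an entity.

    The classification types are assumed to exist in Atlas already.
    """
    body = json.dumps([{"typeName": name} for name in type_names]).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/api/atlas/v2/entity/guid/{guid}/classifications",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def lineage_request(base_url, guid, direction="BOTH", depth=3):
    """Build a GET that fetches the lineage graph around an entity."""
    query = urllib.parse.urlencode({"direction": direction, "depth": depth})
    return urllib.request.Request(
        url=f"{base_url}/api/atlas/v2/lineage/{guid}?{query}",
        headers={"Accept": "application/json"},
        method="GET",
    )


# Hypothetical GUID of a newly ingested table.
tag_pii = classification_request(ATLAS_URL, "abc-123", ["PII"])
trace = lineage_request(ATLAS_URL, "abc-123", direction="OUTPUT", depth=5)
```

Each request object can be sent with `urllib.request.urlopen(...)` once credentials are attached. Building the requests separately from sending them keeps the post-ingestion script easy to test without a running Atlas instance.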
Python data engineers integrate Marquez with Airflow using the `openlineage-airflow` package, which automatically emits lineage events for each task and records which datasets the task reads and writes, without changes to the task code. Engineers query the Marquez API to build impact analysis tools that identify downstream jobs affected by an upstream schema change.
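The impact analysis idea can be illustrated with a breadth-first walk over a lineage graph. The sketch below assumes a response shaped like the one the Marquez lineage endpoint (`GET /api/v1/lineage?nodeId=...`) returns: a `graph` list of nodes, each with an `id` prefixed by `dataset:` or `job:` and a list of `outEdges`. The sample data, namespaces, and the function name `downstream_jobs` are hypothetical.

```python
from collections import deque


def downstream_jobs(lineage, start_id):
    """Return the IDs of all jobs reachable downstream of start_id."""
    nodes = {node["id"]: node for node in lineage.get("graph", [])}
    seen = {start_id}
    queue = deque([start_id])
    jobs = []
    while queue:
        node = nodes.get(queue.popleft())
        if node is None:
            continue  # node lies outside the fetched graph depth
        for edge in node.get("outEdges", []):
            target = edge["destination"]
            if target in seen:
                continue
            seen.add(target)
            queue.append(target)
            if target.startswith("job:"):
                jobs.append(target)
    return jobs


# Hypothetical lineage: raw.orders -> clean_orders -> stg.orders -> daily_report
SAMPLE = {
    "graph": [
        {"id": "dataset:warehouse:raw.orders",
         "outEdges": [{"destination": "job:etl:clean_orders"}]},
        {"id": "job:etl:clean_orders",
         "outEdges": [{"destination": "dataset:warehouse:stg.orders"}]},
        {"id": "dataset:warehouse:stg.orders",
         "outEdges": [{"destination": "job:etl:daily_report"}]},
        {"id": "job:etl:daily_report", "outEdges": []},
    ]
}

affected = downstream_jobs(SAMPLE, "dataset:warehouse:raw.orders")
print(affected)  # ['job:etl:clean_orders', 'job:etl:daily_report']
```

In practice the `lineage` argument would be the parsed JSON from the Marquez API, and the resulting job list could feed an alert or a review checklist before a schema change ships.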
Individual Tool Pages