Data Ingestion
Universal Data Ingestion Framework
★ 3.9
Open Source Message Broker
★ 4.6
N/A (Java-based)
pip install pika

Python data engineers interact with Gobblin by defining configuration files that specify source, extractor, converter, and writer plugins. Gobblin then runs the job on Hadoop or as a standalone Java process. Python orchestration scripts manage Gobblin execution via REST API, monitor job completion, and process ingested output files with PySpark for downstream transformation and loading.
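As a rough illustration of the configuration-driven workflow, the sketch below renders a Gobblin-style job file from a Python dict. The property keys follow the naming used in Gobblin's job configuration files, but every class name, the job name, and the `render_job_config` helper are hypothetical placeholders, not values from a real deployment.

```python
def render_job_config(props):
    """Render a dict as Java-properties-style key=value lines, sorted by key."""
    return "\n".join(f"{key}={value}" for key, value in sorted(props.items())) + "\n"


# Hypothetical job definition: the com.example.* classes stand in for
# whatever source, converter, and writer plugins a real job would use.
job = {
    "job.name": "example_ingest",
    "job.group": "examples",
    "source.class": "com.example.ingest.ExampleSource",
    "converter.classes": "com.example.ingest.ExampleConverter",
    "extract.namespace": "com.example.ingest",
    "writer.builder.class": "com.example.ingest.ExampleWriterBuilder",
    "data.publisher.type": "com.example.ingest.ExamplePublisher",
}

config_text = render_job_config(job)
```

An orchestration script could write `config_text` to the directory where Gobblin picks up job files; where that directory lives depends on how Gobblin is deployed.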
Python data engineers use `pika` or `aio-pika` to connect pipelines to RabbitMQ. A common pattern is a Python producer that publishes enriched records to a topic exchange after transformation, with multiple consumer processes that bind queues to routing key patterns for parallel downstream processing. RabbitMQ's dead-letter exchanges capture messages that consumers reject or that expire, and retry logic can be built on top of them, for example with message TTLs.
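The producer, consumer, and dead-letter pattern described above can be sketched with `pika` roughly as follows. The exchange and queue names, the `orders.*` binding key, and the helper functions are assumptions made for illustration, and `main()` expects a broker on `localhost`. The `pika` import is deferred into `main()` so the helpers can be exercised against a stand-in channel.

```python
import json

EXCHANGE = "records"                  # hypothetical topic exchange name
DEAD_LETTER_EXCHANGE = "records.dlx"  # hypothetical dead-letter exchange name


def declare_topology(channel, queue="orders_worker", binding_key="orders.*"):
    """Declare a topic exchange, a bound work queue, and a dead-letter queue."""
    channel.exchange_declare(exchange=EXCHANGE, exchange_type="topic", durable=True)
    channel.exchange_declare(
        exchange=DEAD_LETTER_EXCHANGE, exchange_type="fanout", durable=True
    )
    channel.queue_declare(queue=queue + ".dead", durable=True)
    channel.queue_bind(queue=queue + ".dead", exchange=DEAD_LETTER_EXCHANGE)
    # Messages rejected from the work queue are re-routed to the dead-letter exchange.
    channel.queue_declare(
        queue=queue,
        durable=True,
        arguments={"x-dead-letter-exchange": DEAD_LETTER_EXCHANGE},
    )
    channel.queue_bind(queue=queue, exchange=EXCHANGE, routing_key=binding_key)


def publish_record(channel, record, routing_key):
    """Serialize a record as JSON and publish it to the topic exchange."""
    body = json.dumps(record).encode("utf-8")
    channel.basic_publish(exchange=EXCHANGE, routing_key=routing_key, body=body)
    return body


def make_callback(process):
    """Wrap a processing function so failures are dead-lettered, not requeued."""
    def on_message(channel, method, properties, body):
        try:
            process(json.loads(body))
        except Exception:
            channel.basic_nack(delivery_tag=method.delivery_tag, requeue=False)
        else:
            channel.basic_ack(delivery_tag=method.delivery_tag)
    return on_message


def main():
    import pika  # deferred so the helpers above do not require pika to import

    connection = pika.BlockingConnection(pika.ConnectionParameters(host="localhost"))
    channel = connection.channel()
    declare_topology(channel)
    publish_record(channel, {"order_id": 1, "status": "enriched"}, "orders.created")
    channel.basic_consume(
        queue="orders_worker", on_message_callback=make_callback(print)
    )
    channel.start_consuming()
```

Calling `main()` with a broker running publishes one record and then consumes from the work queue. Rejecting with `requeue=False` is what sends a failed message to the dead-letter exchange; any retry schedule on top of that is left to the application.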
Individual Tool Pages