Debezium is an open-source change data capture (CDC) platform that monitors databases and streams every committed row-level change as a structured event. It reads directly from the database's transaction or replication log, capturing inserts, updates, and deletes in near real time without polling tables and with little impact on query performance.
Python data engineers typically run Debezium as the CDC producer and write Python consumers for the change streams it generates. After deploying Debezium connectors via Docker Compose or Kubernetes, Python services consume CDC events from Kafka topics using confluent-kafka or kafka-python. Each event carries before and after row images (where the source database is configured to log them), which are then written as Parquet to S3 or applied as upserts to a data warehouse. For teams without Kafka, Debezium Server can sink directly to Amazon Kinesis or Redis Streams, both of which have mature Python client libraries (boto3, redis-py), keeping the Python integration straightforward.
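As a rough illustration of the consumer side, the sketch below decodes a Debezium change event and applies it as an upsert or delete to an in-memory dict that stands in for a warehouse table. It assumes JSON-serialized events (not Avro) and a primary-key column named `id`. The function names, the topic `dbserver1.inventory.customers`, the broker address, and the consumer group are all hypothetical. The Kafka loop uses confluent-kafka and is imported lazily, so the parsing logic can be tried without a broker.

```python
import json
from typing import Optional


def parse_change(raw_value: Optional[bytes]) -> Optional[dict]:
    """Decode one Debezium change event; return None for tombstones."""
    if raw_value is None:
        # Debezium emits a null-valued tombstone after a delete by default.
        return None
    event = json.loads(raw_value)
    # With converter schemas enabled the envelope sits under "payload";
    # with schemas disabled the envelope is the top-level object.
    payload = event.get("payload", event)
    if payload is None:
        return None
    return {
        "op": payload["op"],             # c=create, u=update, d=delete, r=snapshot read
        "before": payload.get("before"),
        "after": payload.get("after"),
    }


def apply_change(table: dict, change: dict, key: str = "id") -> None:
    """Apply one change to `table`, a dict standing in for the target table."""
    op = change["op"]
    if op in ("c", "u", "r"):
        row = change["after"]
        table[row[key]] = row            # upsert keyed on the assumed primary key
    elif op == "d":
        table.pop(change["before"][key], None)


def consume(topic: str, servers: str, table: dict) -> None:
    """Poll a Kafka topic and apply each event (requires confluent-kafka)."""
    from confluent_kafka import Consumer

    consumer = Consumer({
        "bootstrap.servers": servers,
        "group.id": "cdc-demo",          # hypothetical consumer group
        "auto.offset.reset": "earliest",
    })
    consumer.subscribe([topic])
    try:
        while True:
            msg = consumer.poll(1.0)
            if msg is None or msg.error():
                continue                 # nothing received, or a broker-side error
            change = parse_change(msg.value())
            if change is not None:
                apply_change(table, change)
    finally:
        consumer.close()
```

With a broker available, `consume("dbserver1.inventory.customers", "localhost:9092", {})` would run the loop against those placeholder values. In a real pipeline the dict would be replaced by a batched Parquet writer or a warehouse `MERGE` statement.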
Debezium is free to use and is released under the Apache License 2.0.
Debezium is listed under the Data Ingestion category on Python Data Engineering.
Related
| Tool | Description | Pricing | Rating |
|---|---|---|---|
| Apache Druid | Real-Time Analytics Database | Free | ★ 4.3 |
| RabbitMQ (featured) | Open Source Message Broker | Free | ★ 4.6 |
| Apache Pulsar (featured) | Distributed Pub-Sub Messaging | Free | ★ 4.5 |