Data Ingestion
Distributed Pub-Sub Messaging
★ 4.5
Delimited Data Preboarding
★ 3.7
pip install pulsar-client
pip install csvpath

Python data engineers use the `pulsar-client` Python SDK to produce and consume messages from Pulsar topics. Pulsar Functions can be written in Python to perform lightweight transformations (filtering, enriching, or routing messages) without deploying a separate Faust or Spark Streaming cluster. Pulsar's topic compaction and retention policies simplify stateful event stream management.
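As a rough illustration of the filter-and-enrich pattern described above, the sketch below uses a plain consume-transform-produce loop with `pulsar-client` rather than a deployed Pulsar Function. The `enrich` rule, the `relay` helper, the `source` tag, and every topic and subscription name are hypothetical and chosen only for the example; a broker reachable at the given service URL is assumed.

```python
import json
from typing import Optional


def enrich(raw: bytes) -> Optional[bytes]:
    """Hypothetical rule: drop events without a user_id, tag the rest."""
    event = json.loads(raw)
    if not event.get("user_id"):
        return None  # filtered out
    event["source"] = "ingest"  # illustrative enrichment field
    return json.dumps(event, sort_keys=True).encode("utf-8")


def relay(service_url: str, in_topic: str, out_topic: str,
          subscription: str, max_messages: int) -> None:
    """Read from in_topic, apply enrich(), and publish survivors to out_topic."""
    # Imported lazily so enrich() can be exercised without the SDK or a broker.
    import pulsar

    client = pulsar.Client(service_url)
    consumer = client.subscribe(in_topic, subscription)
    producer = client.create_producer(out_topic)
    try:
        for _ in range(max_messages):
            msg = consumer.receive()
            try:
                out = enrich(msg.data())
                if out is not None:
                    producer.send(out)
                consumer.acknowledge(msg)
            except Exception:
                # Ask the broker to redeliver the message later.
                consumer.negative_acknowledge(msg)
    finally:
        client.close()
```

With a local standalone broker, a call might look like `relay("pulsar://localhost:6650", "raw-events", "clean-events", "enricher", 100)`. Keeping the transformation in a pure function makes it easy to unit test, and the same function could later be called from a Pulsar Function's `process` method.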
Python data engineers use CSVPath to validate complex CSV files with business rules that go beyond column type checking — enforcing conditional constraints (if column A has value X, column B must be non-null), cross-row lookups, and custom matching expressions. CSVPath rules are stored as text files separate from the Python pipeline code, making them auditable by non-developers.
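To make the conditional-constraint idea concrete, here is a minimal sketch written with Python's standard `csv` module. It is not CSVPath syntax and does not use the CSVPath API; it only shows the kind of rule ("if column A has value X, column B must be non-null") that such a tool would express declaratively. The `check_conditional` helper, the column names, and the sample data are all hypothetical.

```python
import csv
import io


def check_conditional(rows, when_col, when_value, then_col):
    """Return 1-based data row numbers where when_col equals when_value
    but then_col is missing or blank."""
    failures = []
    for row_no, row in enumerate(rows, start=1):
        if row.get(when_col) == when_value and not (row.get(then_col) or "").strip():
            failures.append(row_no)
    return failures


# Hypothetical sample: shipped orders must carry a ship_date.
SAMPLE = """status,ship_date
shipped,2024-01-05
shipped,
pending,
"""

reader = csv.DictReader(io.StringIO(SAMPLE))
bad_rows = check_conditional(reader, "status", "shipped", "ship_date")
print(bad_rows)  # [2]
```

In this hand-written form the rule lives inside pipeline code. Keeping rules in separate text files, as described above, is what lets non-developers review them.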
Individual Tool Pages