Polyglot Document Intelligence
Kreuzberg is a polyglot document intelligence library with a Rust core and bindings for Python, TypeScript, Go, and more. It extracts text and structured data from documents such as PDFs, images, and office files for use in data ingestion pipelines.
Explore similar tools in the Data Ingestion category that complement Kreuzberg for your data engineering projects.
Open Source Message Broker
A robust, open-source message broker that supports multiple messaging protocols including AMQP, MQTT, and STOMP. RabbitMQ provides reliable message delivery with flexible routing, clustering, and federation for distributed data ingestion pipelines.
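To make "flexible routing" concrete, here is a minimal sketch in plain Python of how an AMQP topic exchange matches routing keys against binding patterns, where `*` stands for exactly one dot-separated word and `#` for zero or more words. It only illustrates the matching rule and is not RabbitMQ's implementation or client API; the function names `topic_matches` and `route`, and the queue names, are hypothetical.

```python
from typing import Dict, List


def topic_matches(pattern: str, routing_key: str) -> bool:
    """Return True if routing_key matches an AMQP-style topic pattern."""
    p = pattern.split(".")
    k = routing_key.split(".")

    def match(i: int, j: int) -> bool:
        if i == len(p):
            # Pattern exhausted: match only if the key is exhausted too.
            return j == len(k)
        if p[i] == "#":
            # '#' may absorb zero or more of the remaining words.
            return any(match(i + 1, n) for n in range(j, len(k) + 1))
        if j == len(k):
            # Key exhausted but a literal or '*' still needs a word.
            return False
        if p[i] == "*" or p[i] == k[j]:
            return match(i + 1, j + 1)
        return False

    return match(0, 0)


def route(bindings: Dict[str, str], routing_key: str) -> List[str]:
    """Return the (hypothetical) queue names whose pattern matches the key."""
    return [q for q, pat in bindings.items() if topic_matches(pat, routing_key)]


# Hypothetical bindings for a document ingestion pipeline.
bindings = {
    "pdf_queue": "docs.*.pdf",
    "all_docs_queue": "docs.#",
    "audit_queue": "#",
}
matched = route(bindings, "docs.invoice.pdf")
```

With a real broker, the same patterns would be supplied as binding keys when binding queues to a topic exchange; the broker then performs the matching.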
Distributed Pub-Sub Messaging
An open-source distributed pub-sub messaging system originally created at Yahoo and now maintained as an Apache Software Foundation project. Pulsar provides multi-tenancy, geo-replication, and unified messaging and streaming, along with a serverless compute framework (Pulsar Functions) for lightweight processing.