A polyglot document intelligence library with a Rust core and bindings for Python, TypeScript, Go, and more. Kreuzberg extracts text and structured data from documents like PDFs, images, and office files for data ingestion pipelines.
Python data engineers use Kreuzberg to build document ingestion pipelines that extract text from uploaded PDFs, scanned images, and Office files. The async API integrates cleanly into FastAPI-based document processing services: an endpoint accepts a file upload, Kreuzberg extracts the text asynchronously, and the pipeline stores the result in a search index or warehouse for downstream analysis.
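The upload, extract, and store flow described above can be sketched roughly as follows. This is a minimal illustration using only the standard library, not Kreuzberg's actual API: the `Extractor` type, `ingest_upload`, `InMemoryIndex`, and `fake_extract` are all hypothetical names introduced here. In a real service, the extractor argument would presumably wrap Kreuzberg's async extraction call, and the function would be awaited from a FastAPI endpoint.

```python
import asyncio
from dataclasses import dataclass, field
from typing import Awaitable, Callable, Dict

# Hypothetical extractor signature (an assumption, not Kreuzberg's API):
# takes raw file bytes and a MIME type, returns the extracted text.
Extractor = Callable[[bytes, str], Awaitable[str]]


@dataclass
class InMemoryIndex:
    """Stand-in for a search index or warehouse table."""

    documents: Dict[str, str] = field(default_factory=dict)

    async def store(self, doc_id: str, text: str) -> None:
        self.documents[doc_id] = text


async def ingest_upload(
    doc_id: str,
    payload: bytes,
    mime_type: str,
    extract: Extractor,
    index: InMemoryIndex,
) -> int:
    """Extract text from an uploaded file and store it; return the text length."""
    text = await extract(payload, mime_type)
    await index.store(doc_id, text)
    return len(text)


async def fake_extract(payload: bytes, mime_type: str) -> str:
    # Placeholder extractor: decodes the bytes as UTF-8 instead of parsing
    # a real document. Swap in the real extraction call here.
    await asyncio.sleep(0)  # yield to the event loop like a real async call
    return payload.decode("utf-8", errors="replace")


def demo() -> Dict[str, str]:
    """Run one ingestion end to end and return the index contents."""
    index = InMemoryIndex()
    asyncio.run(
        ingest_upload("doc-1", b"hello world", "text/plain", fake_extract, index)
    )
    return index.documents
```

Passing the extractor in as an argument keeps the pipeline testable without real documents and leaves the choice of extraction backend open.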
Kreuzberg is free to use.
Kreuzberg is listed under the Data Ingestion category on Python Data Engineering.
Details
Category: Data Ingestion

Related
| Tool | Description | Pricing | Rating |
|---|---|---|---|
| AWS Data Wrangler | AWS Data Utility Belt for Python | Free | ★ 4.3 |
| CsvPath Framework (new) | Delimited Data Preboarding | Free | ★ 3.7 |
| Mage.AI (new) | Data Pipeline Tool | Freemium | ★ 4.6 |