Discover 23 tools tagged with Self-Hosted for Python data engineering.
Self-hosted tools can be deployed and operated on your own infrastructure — on-premise servers, private cloud, or VMs — rather than relying on a managed SaaS vendor. Python data engineers choose self-hosted options for data sovereignty, cost control, customisation, and compliance requirements that prohibit sending data to third-party cloud services.
Event-Driven Orchestration Platform
A scalable, event-driven, language-agnostic orchestration and scheduling platform. Kestra provides a declarative YAML-based workflow definition with a rich UI, supporting hundreds of plugins for data engineering, DevOps, and microservice orchestration.
Modern BI Web Application
A modern, enterprise-ready business intelligence web application. Superset provides an intuitive interface for creating interactive dashboards, exploring data through SQL, and building rich visualizations without writing code.
Open Source Message Broker
A robust, open-source message broker that supports multiple messaging protocols including AMQP, MQTT, and STOMP. RabbitMQ provides reliable message delivery with flexible routing, clustering, and federation for distributed data ingestion pipelines.
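To illustrate the flexible routing mentioned above, here is a minimal pure-Python sketch of how AMQP topic-exchange patterns are matched against routing keys: `*` stands for exactly one dot-separated word and `#` for zero or more. The function names, queue names, and routing keys are hypothetical, and this only mimics the matching rule; it is not RabbitMQ's implementation and does not talk to a broker.

```python
def topic_matches(pattern: str, routing_key: str) -> bool:
    """Return True if an AMQP-style topic pattern matches a routing key."""
    p = pattern.split(".")
    k = routing_key.split(".")

    def match(i: int, j: int) -> bool:
        if i == len(p):
            # Pattern exhausted: match only if the key is exhausted too.
            return j == len(k)
        if p[i] == "#":
            # '#' absorbs zero or more words; try every possible split.
            return any(match(i + 1, n) for n in range(j, len(k) + 1))
        if j == len(k):
            return False
        if p[i] == "*" or p[i] == k[j]:
            # '*' matches exactly one word; otherwise words must be equal.
            return match(i + 1, j + 1)
        return False

    return match(0, 0)


def route(bindings: dict[str, str], routing_key: str) -> list[str]:
    """Return the (hypothetical) queues whose binding pattern matches the key."""
    return [queue for queue, pattern in bindings.items()
            if topic_matches(pattern, routing_key)]


# Hypothetical bindings for an ingestion pipeline.
bindings = {
    "eu_orders": "orders.eu.*",
    "all_orders": "orders.#",
    "created_any": "*.*.created",
}
matched = route(bindings, "orders.eu.created")
```

With these bindings, a message published with the key `orders.eu.created` would be delivered to all three queues, while `orders.us.cancelled` would reach only `all_orders`.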
Event Messaging Platform
An open-source event messaging platform that provides a REST API on top of Kafka-like queues. Nakadi simplifies event streaming by offering schema registration, data governance, and subscription-based consumption without direct Kafka client management.
Open-Source Change Data Capture Platform
An open-source CDC platform that monitors databases and streams every committed row-level change as a structured event. Debezium reads directly from database replication logs, capturing inserts, updates, and deletes in near real time without polling tables and with minimal impact on query performance.
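As a rough sketch of what consuming such row-level change events can look like, the snippet below replays a few hand-written events into an in-memory copy of a table. It assumes a simplified form of Debezium's event envelope with `op` (`c` create, `u` update, `d` delete, `r` snapshot read), `before`, and `after`; real events carry additional fields such as `source` and `ts_ms`. The `apply_change` helper, the `id` key column, and the sample rows are hypothetical.

```python
from typing import Any


def apply_change(table: dict[Any, dict], event: dict, key: str = "id") -> None:
    """Apply one simplified Debezium-style change event to an in-memory table."""
    op = event["op"]
    if op in ("c", "r", "u"):
        # Creates, snapshot reads, and updates carry the new row state in 'after'.
        row = event["after"]
        table[row[key]] = row
    elif op == "d":
        # Deletes carry the last known row state in 'before'
        # (assumed here to include the key column).
        row = event["before"]
        table.pop(row[key], None)
    else:
        raise ValueError(f"unsupported op: {op!r}")


# Hand-written sample events, not captured from a real connector.
events = [
    {"op": "c", "before": None, "after": {"id": 1, "email": "a@example.com"}},
    {"op": "c", "before": None, "after": {"id": 2, "email": "b@example.com"}},
    {"op": "u",
     "before": {"id": 2, "email": "b@example.com"},
     "after": {"id": 2, "email": "b2@example.com"}},
    {"op": "d", "before": {"id": 1, "email": "a@example.com"}, "after": None},
]

replica: dict[Any, dict] = {}
for event in events:
    apply_change(replica, event)
```

After the four events are applied, `replica` holds only row 2 with its updated email, mirroring the state of the source table.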
Git-Like Data Lake Versioning
An open-source platform that delivers resilience and manageability to object-storage-based data lakes. lakeFS provides git-like branching, merging, and versioning for data, enabling safe experimentation and CI/CD workflows for data pipelines.
Open-Source Monitoring System
An open-source systems monitoring and alerting toolkit with a multi-dimensional data model (time series identified by a metric name and key-value labels) and a flexible query language (PromQL). Prometheus is the de facto standard for monitoring cloud-native and Kubernetes-based data infrastructure.
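The multi-dimensional data model can be pictured as a metric name plus a set of key-value labels. The sketch below renders one sample in the style of the Prometheus text exposition format using only the standard library; the `render_sample` helper, the metric name, and the labels are hypothetical, and a real exporter would normally use an official client library instead.

```python
def _escape(value: str) -> str:
    """Escape backslashes, double quotes, and newlines in a label value."""
    return (value.replace("\\", "\\\\")
                 .replace('"', '\\"')
                 .replace("\n", "\\n"))


def render_sample(name: str, labels: dict[str, str], value: float) -> str:
    """Render one sample as 'name{label="value",...} value'."""
    if not labels:
        return f"{name} {value}"
    # Sort labels so the output is deterministic.
    body = ",".join(f'{key}="{_escape(val)}"'
                    for key, val in sorted(labels.items()))
    return f"{name}{{{body}}} {value}"


# Hypothetical pipeline metric with two label dimensions.
line = render_sample(
    "pipeline_rows_total",
    {"pipeline": "orders", "status": "ok"},
    1200,
)
```

Each distinct label combination is its own time series, so a PromQL selector such as `pipeline_rows_total{status="ok"}` would pick out only the series carrying that label value.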
Observability & Dashboarding Platform
An open-source analytics and interactive visualization platform. Grafana connects to dozens of data sources including Prometheus, InfluxDB, and Elasticsearch to create rich monitoring dashboards for data infrastructure and pipeline health.
Extremely fast Python package manager written in Rust
uv is an extremely fast Python package and project manager written in Rust by Astral. It replaces pip, pip-tools, virtualenv, pyenv, pipx, and poetry in a single unified tool, and is reported to be 10–100x faster than pip at dependency resolution and installation, helped by a global package cache.
Python dependency management and packaging in one tool
Poetry is a Python dependency management and packaging tool that consolidates setup.py, requirements.txt, and Pipfile into a single pyproject.toml. It provides deterministic builds via a lockfile, automatic virtual environment management, and one-command PyPI publishing.