Discover 18 tools tagged with Stream Processing for Python data engineering.
Data Flow Automation
Apache NiFi is an easy-to-use, powerful, and reliable system to process and distribute data, with a web-based user interface for managing data flows.
Distributed Event Streaming Platform
Apache Kafka is a distributed event streaming platform capable of handling trillions of events a day. It is used to build real-time streaming data pipelines and applications that need high throughput, fault tolerance, and scalability.
Stream Processing Framework
Apache Flink is a framework and distributed processing engine for stateful computations over unbounded and bounded data streams. It is known for high-performance stream processing with exactly-once semantics.
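To make "stateful computation over an unbounded stream" concrete, here is a minimal single-process sketch in plain Python. It is not the API of any framework listed here; the function name `running_counts` and the event values are hypothetical, chosen only for illustration. The point is that per-key state survives between events and a result is emitted as each event arrives, rather than after the input ends.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, Tuple


def running_counts(events: Iterable[str]) -> Iterator[Tuple[str, int]]:
    """Yield (key, count_so_far) for every event as it arrives.

    `events` may be an endless generator; nothing here needs the end of input.
    """
    state: Dict[str, int] = defaultdict(int)  # per-key state kept between events
    for key in events:
        state[key] += 1
        yield key, state[key]


if __name__ == "__main__":
    for key, count in running_counts(["click", "view", "click"]):
        print(key, count)
```

A distributed engine additionally partitions this state by key across workers and checkpoints it so that it can recover after a failure; the sketch leaves both out.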
Real-Time Computation System
Apache Storm is a distributed real-time computation system that makes it easy to process unbounded streams of data reliably. It is designed to be fast and to scale horizontally.
Scalable Stream Processing
Spark Streaming is an extension of the core Apache Spark API that enables scalable, high-throughput, fault-tolerant processing of live data streams. It integrates with the rest of the Spark ecosystem for complex real-time data processing tasks.
Unified Batch and Stream Processing
Apache Beam is a unified programming model for defining and executing batch and streaming data processing pipelines. Pipelines are portable across supported execution engines (runners), including Apache Flink, Apache Spark, and Google Cloud Dataflow, which makes Beam a good fit for flexible, scalable data pipelines.
Distributed Stream Processing Framework
A distributed stream processing framework that uses Apache Kafka for messaging and Apache Hadoop YARN for fault tolerance, processor isolation, security, and resource management. Samza provides a simple API for building stateful stream processing applications.
Incremental Data Processing Framework
An open-source framework for managing storage for real-time data processing on top of data lakes. Hudi provides record-level insert, update, and delete capabilities along with change streams, enabling incremental data pipelines on large-scale datasets.
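The combination of record-level writes and a change stream can be sketched with a toy in-memory table. This is an illustration of the idea only, not Hudi's API or storage format; the class `ToyTable`, its methods, and the integer "position" used to read changes are all hypothetical.

```python
from typing import Any, Dict, List, Optional, Tuple

Change = Tuple[str, str, Optional[Dict[str, Any]]]  # (operation, key, row)


class ToyTable:
    """In-memory table that records every write in an append-only change log."""

    def __init__(self) -> None:
        self.rows: Dict[str, Dict[str, Any]] = {}
        self.changes: List[Change] = []

    def upsert(self, key: str, row: Dict[str, Any]) -> None:
        # Insert the record if the key is new, otherwise overwrite it.
        op = "update" if key in self.rows else "insert"
        self.rows[key] = row
        self.changes.append((op, key, row))

    def delete(self, key: str) -> None:
        # Only log a delete when the key actually existed.
        if self.rows.pop(key, None) is not None:
            self.changes.append(("delete", key, None))

    def changes_since(self, position: int) -> List[Change]:
        # An incremental consumer remembers its position and reads only newer changes.
        return self.changes[position:]


if __name__ == "__main__":
    table = ToyTable()
    table.upsert("u1", {"name": "Ada"})
    table.upsert("u1", {"name": "Ada L."})
    table.delete("u1")
    print(table.changes_since(1))
```

A downstream job that stores the last position it read can process only the new changes on each run, which is what makes a pipeline incremental rather than a full rescan.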
Streaming SQL Database
A streaming SQL database that runs SQL queries continuously on incoming data streams. PipelineDB is built as a PostgreSQL extension, allowing you to use standard SQL to define continuous views over streaming data for real-time analytics.
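As a rough mental model of a continuous view, the sketch below keeps a grouped aggregate up to date on every insert and never stores the raw events. It is plain Python, not PipelineDB syntax; the class `ContinuousView` and the query it imitates (a per-key `COUNT` and `AVG`) are assumptions made for illustration.

```python
from collections import defaultdict
from typing import Dict, Tuple


class ContinuousView:
    """Toy stand-in for: SELECT key, COUNT(*), AVG(value) ... GROUP BY key."""

    def __init__(self) -> None:
        self._count: Dict[str, int] = defaultdict(int)
        self._total: Dict[str, float] = defaultdict(float)

    def insert(self, key: str, value: float) -> None:
        # Fold the event into the running aggregates; the event itself is discarded.
        self._count[key] += 1
        self._total[key] += value

    def query(self) -> Dict[str, Tuple[int, float]]:
        # Reading the view is cheap because the aggregation has already been done.
        return {k: (n, self._total[k] / n) for k, n in self._count.items()}


if __name__ == "__main__":
    view = ContinuousView()
    for key, value in [("a", 2.0), ("a", 4.0), ("b", 1.0)]:
        view.insert(key, value)
    print(view.query())
```

The design trade-off is that only the aggregates are retained, so storage stays small, but questions that were not defined as a view up front cannot be answered from past events.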
Real-Time Streaming Data Platform
A framework for building real-time streaming data processing applications. SwimOS combines streaming data processing with a built-in state store and UI capabilities, enabling continuous intelligence applications that process and visualize data in real time.
Distributed Pub-Sub Messaging
An open-source distributed pub-sub messaging system originally created by Yahoo. Pulsar provides multi-tenancy, geo-replication, and unified messaging and streaming with a serverless compute framework for lightweight processing.
Managed Real-Time Streaming
A fully managed, cloud-based service from AWS for real-time data streaming and processing. Kinesis enables collecting, processing, and analyzing streaming data at any scale, with integrations across the AWS ecosystem.