Discover 39 tools tagged with Cloud Native for Python data engineering.
Python Data Loading Library
Python library that facilitates the loading phase in ETL processes. Designed to simplify loading data into various data stores or processing systems.
Transform Data in Your Warehouse
Open-source transformation tool enabling data analysts and engineers to transform, test, and document data in the warehouse. Focuses on the transform part of ETL with SQL templating and Python scripting.
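As a rough illustration of what SQL templating means here, the following is a minimal sketch using only Python's standard-library `string.Template`. The table and column names (`raw_orders`, `order_date`, `amount`) and the `render_model` helper are hypothetical, and a real transformation tool will typically use a fuller templating engine than this.

```python
from string import Template

# Hypothetical model: a SQL SELECT whose placeholders are filled in at build time.
MODEL_SQL = Template(
    "SELECT customer_id, SUM(amount) AS total_amount\n"
    "FROM $source_table\n"
    "WHERE order_date >= '$start_date'\n"
    "GROUP BY customer_id"
)


def render_model(source_table: str, start_date: str) -> str:
    """Render the templated SQL into a concrete query string."""
    # substitute() raises KeyError if a placeholder is left unfilled.
    return MODEL_SQL.substitute(source_table=source_table, start_date=start_date)


if __name__ == "__main__":
    print(render_model("raw_orders", "2024-01-01"))
```

Plain string substitution like this is only suitable for trusted, developer-authored values; it is not a safe way to handle user input.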
Kubernetes-Native Workflow Engine
Open-source container-native workflow engine for orchestrating parallel jobs on Kubernetes. Designed for large-scale computational tasks with powerful workflow features.
Async ORM for Python
Easy-to-use asyncio ORM inspired by Django. Designed around async/await syntax, which makes it well suited to asynchronous applications and modern Python development.
Distributed Event Streaming Platform
Distributed event streaming platform capable of handling trillions of events a day. Used for building real-time streaming data pipelines and applications with high-throughput, fault-tolerance, and scalability.
Stream Processing Framework
Framework and distributed processing engine for stateful computations over unbounded and bounded data streams. Known for high performance in streaming data processing with exactly-once semantics.
End-to-End ML Platform
End-to-end open-source platform for machine learning enabling complex computations with data flow graphs. Widely used for deep learning applications with robust production support.
Unified Batch and Stream Processing
Advanced unified programming model for defining and executing data processing workflows that can run on any execution engine. Provides portability across multiple execution environments including Apache Flink, Apache Spark, and Google Cloud Dataflow. Ideal for building flexible, scalable data pipelines.
AWS SDK for Python
The official Amazon Web Services (AWS) SDK for Python. Enables Python developers to write software that uses services such as Amazon S3, EC2, and Lambda. Provides an easy-to-use, object-oriented API as well as low-level access to AWS services, making it simple to integrate Python applications with AWS infrastructure.
GCP SDK for Python
Google Cloud Platform's official client library for Python, enabling seamless integration with GCP services like Compute Engine, Cloud Storage, BigQuery, and Pub/Sub. Designed for a Pythonic, intuitive experience when interacting with Google Cloud services, with idiomatic code patterns and comprehensive documentation.
Microsoft Azure SDK
Microsoft's Azure SDK for Python, offering a complete set of packages for interacting with Azure resources and services. Supports a wide range of Azure services, including Virtual Machines, Storage, Databases, and AI services. Provides tools for resource management and service interaction within the Azure ecosystem.
IBM Cloud Services SDK
Official SDK for interacting with various IBM Cloud services programmatically. Provides comprehensive support for IBM Cloud services including CIS, DNS, IAM, VPC, Watson AI, and more. Enables management and automation of IBM Cloud resources with Python, compatible with Python 3.6 and above.
OCI SDK for Python
Official SDK for writing code to manage Oracle Cloud Infrastructure resources. Supports a wide range of Oracle Cloud services, with functionality for compute, storage, networking, databases, and more. Available across multiple operating systems and Python versions, providing a robust interface for OCI resource management.
Scalable Object Storage
Amazon Simple Storage Service offers industry-leading scalability, data availability, security, and performance for object storage. Commonly used for data backup, archival, big data analytics, disaster recovery, and content distribution. Designed for 99.999999999% (11 nines) durability and integrates with AWS analytics and ML services.
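To put the 99.999999999% durability figure in concrete terms, here is a small worked calculation. It assumes the figure is read as an annual per-object survival probability; the `expected_annual_losses` helper and the object count of 10 million are illustrative choices, not part of any SDK.

```python
from decimal import Decimal

# 99.999999999% durability, read here as an annual per-object survival probability.
DURABILITY = Decimal("0.99999999999")


def expected_annual_losses(object_count: int) -> Decimal:
    """Expected number of objects lost per year under the assumption above."""
    return object_count * (Decimal(1) - DURABILITY)


if __name__ == "__main__":
    losses = expected_annual_losses(10_000_000)
    print(f"Expected objects lost per year: {losses}")
    print(f"Years per expected single-object loss: {1 / losses}")
```

Under that reading, storing 10 million objects gives an expected loss of 0.0001 objects per year, or roughly one object every 10,000 years.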
Scalable Virtual Servers
Amazon Elastic Compute Cloud provides secure, resizable compute capacity in the cloud. Offers a wide selection of instance types optimized for different use cases, including compute-intensive, memory-intensive, and storage-optimized workloads. Well suited to running data processing jobs, ML training, and distributed applications.
Cloud Data Warehouse
Fully managed, petabyte-scale data warehouse service that makes it simple and cost-effective to analyze all your data using standard SQL and existing BI tools. Offers fast query performance using columnar storage, data compression, and massively parallel query execution. Integrates with AWS data lake and analytics services.
Massively Scalable Object Storage
Microsoft's object storage solution for the cloud, optimized for storing massive amounts of unstructured data. Offers hot, cool, and archive access tiers for cost optimization. Ideal for serving images, documents, streaming video and audio, data lakes, backup and disaster recovery, and big data analytics.
Enterprise Data Lake
Scalable and secure data lake that enables high-performance analytics workloads. Built on Azure Blob Storage with hierarchical namespace capabilities. Integrates seamlessly with Azure analytics services like Synapse, Databricks, and HDInsight. Optimized for big data analytics with enterprise-grade security and compliance.
Unified Analytics Platform
Analytics service that brings together enterprise data warehousing and big data analytics. Provides a unified experience to ingest, explore, prepare, manage, and serve data for immediate BI and machine learning needs. Supports both serverless and dedicated resource models and integrates deeply with Power BI and Azure ML.
Unified Object Storage
Unified object storage for developers and enterprises, covering everything from live application data to cloud archival. Offers multiple storage classes, including Standard, Nearline, Coldline, and Archive, for cost optimization. Provides strong consistency, high durability, and integration with Google Cloud data analytics and ML services.
High-Performance Virtual Machines
Offers virtual machines running in Google's innovative data centers and worldwide fiber network. Provides predefined and custom machine types, sustained use discounts, and per-second billing. Ideal for compute-intensive workloads, batch processing, and running distributed data processing frameworks like Spark and Hadoop.
Serverless Data Warehouse
Fast, economical, and fully managed serverless data warehouse for large-scale data analytics. Runs fast SQL queries using the processing power of Google's infrastructure. Includes built-in machine learning capabilities, automatic scaling, and pay-per-query pricing. Ideal for analyzing petabytes of data with standard SQL.
Database Design as Code
Free, simple tool to draw Entity-Relationship diagrams by just writing code. Designed to help developers design and visualize database structures in a straightforward and intuitive way. Perfect for quickly sketching database schemas and sharing them with your team through simple DSL syntax.
Collaborative Diagramming Platform
Online diagram application that makes it easy to sketch and share professional flowcharts and database diagrams. Offers comprehensive support for database design and ER diagrams in a collaborative environment for teams. Features real-time collaboration, an extensive template library, and integrations with popular tools.
Document NoSQL Database
Scalable, flexible document database with rich querying and indexing capabilities. Stores data as JSON-like documents, making it well suited to rapid development and horizontal scaling. Supports aggregation pipelines and transactions, and has mature Python driver support through PyMongo.
Enterprise Data Cloud
Enterprise data cloud offering storage, processing, and exploration capabilities for any data. Focuses on enterprise-level data management and analytics, with comprehensive support for the Hadoop ecosystem, machine learning, and real-time analytics. Provides hybrid and multi-cloud deployment options.
Enterprise Data Warehouse
Established enterprise data warehousing solution offering comprehensive capabilities for data warehousing, data lakes, and analytics. Known for scalability and support for hybrid cloud environments. Provides advanced analytics, workload management, and integration with popular BI tools.
Unified Analytics Platform
Cloud data platform supporting data engineering, collaborative data science, machine learning, and analytics. Built on Apache Spark with Delta Lake for reliable data lakes. Ideal for organizations focusing on advanced analytics, ML workflows, and collaborative data science with notebooks.
Self-Managing Cloud Database
High-performance, self-managing data management service with automated patching, upgrading, and tuning. Particularly beneficial for enterprises in the Oracle ecosystem or those seeking highly automated data management. Features include automatic indexing, scaling, and security patching.
Cloud Data Platform
Cloud-native data platform supporting data warehousing, data lakes, data engineering, data science, and data sharing. Architecture separates compute and storage for independent scaling. Features include zero-copy cloning, time travel, automatic scaling, and multi-cloud support. Pay only for resources used.
Lightweight Async ORM
Lightweight and async-ready ORM designed to work with FastAPI and Starlette. Particularly suited for applications requiring asynchronous database operations with minimal overhead and modern Python async/await patterns.