Data Ingestion
AWS Data Utility Belt for Python
★ 4.3
Managed Real-Time Streaming
★ 4.4
pip install awswrangler
pip install boto3
AWS SDK for pandas (formerly AWS Data Wrangler, still installed and imported as `awswrangler`) is a common choice for AWS-native Python data pipelines. Engineers replace `boto3` + `pandas` boilerplate with single calls: `wr.s3.read_parquet('s3://bucket/prefix/')` reads every Parquet file under the prefix into one DataFrame, and `wr.s3.to_parquet(df, 's3://bucket/output/', dataset=True)` writes it back as a dataset. Passing `partition_cols` partitions the output, and passing `database` and `table` registers it in the Glue catalog.
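As a rough illustration of the read and write calls described above, the sketch below copies a Parquet prefix into a partitioned dataset registered in Glue. The function names `glue_write_kwargs` and `copy_prefix`, along with every bucket, database, table, and column name a caller would pass in, are hypothetical. It also assumes the Glue database already exists and that AWS credentials are configured.

```python
def glue_write_kwargs(path, database, table, partition_cols):
    """Build keyword arguments for wr.s3.to_parquet (hypothetical helper).

    dataset=True enables dataset mode; partition_cols, database, and table
    are what actually turn on partitioning and Glue catalog registration.
    """
    return {
        "path": path,
        "dataset": True,
        "partition_cols": list(partition_cols),
        "database": database,  # assumed to exist in the Glue catalog already
        "table": table,
        "mode": "append",      # add new files rather than overwrite the dataset
    }


def copy_prefix(src, dst, database, table, partition_cols):
    """Read all Parquet files under src and write them to dst as a dataset."""
    # Imported lazily so the helper above can be used without awswrangler installed.
    import awswrangler as wr

    df = wr.s3.read_parquet(path=src)
    return wr.s3.to_parquet(
        df=df, **glue_write_kwargs(dst, database, table, partition_cols)
    )
```

A call might look like `copy_prefix('s3://bucket/prefix/', 's3://bucket/output/', 'analytics', 'events', ['event_date'])`, where all five arguments are placeholders.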
Python data engineers use `boto3`'s Kinesis client to put records onto a Data Stream from Lambda functions or EC2-based producers. Consumer applications use the Kinesis Client Library (KCL) through its Python wrapper, `amazon_kclpy` (published from the `amazon-kinesis-client-python` project), to process shards in parallel with built-in checkpointing. This is a common pattern for real-time log processing and event enrichment.
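The producer side described above can be sketched as follows. This is a minimal example rather than a production producer: the helper names (`to_entries`, `chunked`, `send_events`) and the event shape are assumptions, and it does not retry failed records. It uses `put_records`, which accepts at most 500 records per request, so events are sent in batches.

```python
import json

MAX_BATCH = 500  # PutRecords accepts at most 500 records per request


def to_entries(events, key_field):
    """Convert dict events into PutRecords entries (hypothetical helper)."""
    return [
        {
            "Data": json.dumps(event).encode("utf-8"),
            # The partition key determines which shard receives the record.
            "PartitionKey": str(event[key_field]),
        }
        for event in events
    ]


def chunked(items, size=MAX_BATCH):
    """Yield consecutive slices of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def send_events(events, stream_name, key_field, client=None):
    """Send events to a Kinesis data stream and return the failed-record count."""
    if client is None:
        # Imported lazily so the helpers above work without boto3 installed.
        import boto3
        client = boto3.client("kinesis")

    failed = 0
    for batch in chunked(to_entries(events, key_field)):
        response = client.put_records(StreamName=stream_name, Records=batch)
        failed += response["FailedRecordCount"]
    return failed
```

A nonzero return value means some records were rejected, typically through throttling. A real producer would inspect the per-record results in the response and resend only those entries.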
Individual Tool Pages