ETL Frameworks
Data Manipulation & Analysis Library
★ 4.9
Python ETL Package
★ 4.3
pip install pandas
pip install petl
Pandas is the go-to tool for data wrangling in Python pipelines. Engineers use DataFrames to load raw data from CSVs or databases, clean and transform it (renaming columns, filtering rows, filling nulls), then write results to Parquet or a data warehouse. It is the standard intermediate layer between data ingestion and downstream processing.
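The load, clean, and write steps described above might look roughly like the following. This is a minimal sketch, not a reference pipeline: the inline CSV, the column names, and the `clean_orders` helper are all hypothetical, and the Parquet write is left as a comment because it assumes an optional engine such as pyarrow is installed.

```python
import io

import pandas as pd

# Hypothetical raw input; a real pipeline would read a file or a database query.
RAW_CSV = """Order ID,Cust Name,Amount
1,alice,10.5
2,bob,
3,carol,-4.0
"""


def clean_orders(raw: pd.DataFrame) -> pd.DataFrame:
    """Rename columns, fill missing amounts with 0.0, and drop negative amounts."""
    df = raw.rename(
        columns={"Order ID": "order_id", "Cust Name": "customer", "Amount": "amount"}
    )
    df["amount"] = df["amount"].fillna(0.0)  # fill nulls
    df = df[df["amount"] >= 0]               # filter rows
    return df.reset_index(drop=True)


raw = pd.read_csv(io.StringIO(RAW_CSV))
orders = clean_orders(raw)

# Writing the result out (assumes pyarrow or fastparquet is available):
# orders.to_parquet("orders.parquet", index=False)
```

Keeping the transformation in a plain function that takes and returns a DataFrame makes it easy to test without touching files or a warehouse.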
Python data engineers use petl for lightweight, script-based ETL tasks where Spark or Airflow would be overkill. A typical pipeline reads from a CSV or SQLite source, applies field renames and filters, then writes to a Postgres table, all in under 50 lines of readable Python.
Individual Tool Pages