ETL Frameworks
Open-Source Data Integration Platform
★ 4.6
Data Manipulation & Analysis Library
★ 4.9
pip install airbyte
pip install pandas
Python data engineers use Airbyte to replace hand-written API ingestion scripts: they select a pre-built connector for the source API and configure credentials, and Airbyte then handles pagination, rate limiting, and incremental sync automatically. When a required source connector doesn't exist, engineers use the Python CDK to build and publish a custom one.
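To make concrete what a connector takes over, here is a minimal sketch of the hand-written pagination and incremental-sync logic described above. It does not use Airbyte or any real API: `fetch_page`, `incremental_sync`, the `updated_at` cursor field, and the in-memory record list are all hypothetical stand-ins chosen for illustration.

```python
# Hypothetical in-memory "API": records ordered by an updated_at timestamp.
_RECORDS = [{"id": i, "updated_at": i * 10} for i in range(1, 8)]


def fetch_page(since, page, page_size=3):
    """Stand-in for an HTTP call: one page of records newer than `since`."""
    fresh = [r for r in _RECORDS if r["updated_at"] > since]
    start = page * page_size
    return fresh[start:start + page_size]


def incremental_sync(cursor=0):
    """Walk every page past `cursor`; return the records and the new cursor."""
    synced = []
    page = 0
    while True:
        batch = fetch_page(cursor, page)
        if not batch:  # an empty page means there is nothing left to read
            break
        synced.extend(batch)
        page += 1
    # Advance the cursor only after the whole run succeeded.
    new_cursor = max((r["updated_at"] for r in synced), default=cursor)
    return synced, new_cursor


records, cursor = incremental_sync(0)        # first run reads everything
later, cursor = incremental_sync(cursor)     # second run finds nothing new
```

A real script would also need retries and rate-limit backoff around the network call, which is the bookkeeping the paragraph above says Airbyte handles automatically.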
Pandas is the go-to tool for data wrangling in Python pipelines. Engineers use DataFrames to load raw data from CSVs or databases, clean and transform it (renaming columns, filtering rows, filling nulls), then write results to Parquet or a data warehouse. It is the standard intermediate layer between data ingestion and downstream processing.
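The load, clean, and write steps above can be sketched as follows. The CSV content, the column names, and the `clean_orders` helper are assumptions made up for illustration. The Parquet write is left as a comment because it needs an optional engine such as pyarrow to be installed.

```python
import io

import pandas as pd

# Made-up raw input; in a pipeline this would be a file or a database query.
RAW_CSV = """Order ID,Customer,Amount,Status
1,alice,10.5,paid
2,bob,,paid
3,carol,7.0,cancelled
4,dave,3.25,paid
"""


def clean_orders(raw: pd.DataFrame) -> pd.DataFrame:
    """Rename columns, keep only paid orders, and fill missing amounts."""
    return (
        raw.rename(
            columns={
                "Order ID": "order_id",
                "Customer": "customer",
                "Amount": "amount",
                "Status": "status",
            }
        )
        .query("status == 'paid'")   # filter rows
        .fillna({"amount": 0.0})     # fill nulls in the amount column
        .reset_index(drop=True)
    )


raw = pd.read_csv(io.StringIO(RAW_CSV))
cleaned = clean_orders(raw)

# Writing the result out (requires pyarrow or fastparquet):
# cleaned.to_parquet("orders.parquet", index=False)
```

Keeping the transformation in one function that takes and returns a DataFrame makes it easy to test on a small sample before pointing it at real data.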
Individual Tool Pages