Discover 20 tools tagged with Lightweight for Python data engineering.
Lightweight tools have minimal dependencies, small footprints, and simple configuration, making them ideal for embedded use cases, edge computing, and development environments. Python data engineers choose lightweight tools when full-featured frameworks like Spark or Airflow are excessive for the scale or complexity of the pipeline.
Web Scraping & HTML Parsing
Library for web scraping and parsing HTML/XML documents. Extensively used in data wrangling to clean, parse, and extract data from web sources.
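As a rough illustration of the kind of extraction such a library performs, the sketch below pulls link targets and link text out of an HTML fragment using only Python's standard-library `html.parser`. It is not the API of the library listed here; `LinkExtractor` and `extract_links` are hypothetical names chosen for this example.

```python
from html.parser import HTMLParser


class LinkExtractor(HTMLParser):
    """Collect (href, text) pairs for every <a href=...> tag."""

    def __init__(self):
        super().__init__()
        self.links = []
        self._href = None   # href of the <a> currently being read, if any
        self._text = []     # text fragments seen inside that <a>

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        # Only keep text while we are inside a link that has an href.
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append((self._href, "".join(self._text).strip()))
            self._href = None


def extract_links(html):
    """Return a list of (href, link text) tuples found in the HTML string."""
    parser = LinkExtractor()
    parser.feed(html)
    parser.close()
    return parser.links
```

A dedicated parsing library typically adds tolerant handling of malformed markup and a richer query interface on top of this basic idea.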
Python Data Structure Validation
Validates Python data structures with straightforward syntax and clear error messages. Ensures structure and content adhere to specified schemas.
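To make the idea of schema validation concrete, here is a minimal standard-library sketch that checks a nested dictionary against a schema of expected types and returns readable error messages. It is an assumption-laden illustration, not the listed library's API; the `validate` function and its schema format are hypothetical.

```python
def validate(data, schema, path=""):
    """Check `data` against `schema` and return a list of error messages.

    In this sketch a schema is either a type (e.g. str, int) or a dict that
    maps required keys to nested schemas. An empty list means the data is valid.
    """
    errors = []
    label = path or "value"
    if isinstance(schema, dict):
        if not isinstance(data, dict):
            return [f"{label}: expected dict, got {type(data).__name__}"]
        for key, sub_schema in schema.items():
            child = f"{path}.{key}" if path else key
            if key not in data:
                errors.append(f"{child}: missing required key")
            else:
                errors.extend(validate(data[key], sub_schema, child))
    elif isinstance(schema, type):
        if not isinstance(data, schema):
            errors.append(
                f"{label}: expected {schema.__name__}, got {type(data).__name__}"
            )
    return errors


# Example schema and record (both made up for illustration).
config_schema = {"name": str, "port": int, "db": {"host": str}}
problems = validate({"name": "etl", "port": "80", "db": {}}, config_schema)
```

Real validation libraries usually go further, with optional keys, value constraints, and coercion, but the error-path reporting shown here is the core of the "clear error messages" idea.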
Lightweight Async ORM
Lightweight and async-ready ORM designed to work with FastAPI and Starlette. Particularly suited for applications requiring asynchronous database operations with minimal overhead and modern Python async/await patterns.
CLI Data Integration Tool
A CLI data integration tool specialized in moving data between databases and storage systems. Sling provides a simple command-line interface for extracting and loading data with support for incremental syncs, transformations, and multiple output formats.
Sheets to Data Warehouse Loader
An open-source tool that imports your Google Sheets into your data warehouse live. Google Sheets ETL automates the extraction of spreadsheet data into structured tables, bridging the gap between business users and data infrastructure.
DataFrame Comparison Library
A Python library by Capital One that facilitates the comparison of two DataFrames across Pandas, Polars, Spark, and more. datacompy provides detailed match reports with configurable tolerance levels, ideal for validating data pipeline outputs.
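The tolerance-based matching described above can be sketched with plain Python records. This is a simplified illustration of the concept only, not datacompy's API; `compare_rows` and its report layout are hypothetical, and rows are assumed to be dictionaries joined on a single key column.

```python
import math


def compare_rows(left, right, key, abs_tol=0.0, rel_tol=0.0):
    """Compare two lists of dict rows joined on `key`.

    Numeric values match when they are within the given absolute or relative
    tolerance; all other values must be exactly equal.
    """
    left_by_key = {row[key]: row for row in left}
    right_by_key = {row[key]: row for row in right}
    report = {
        "only_left": sorted(left_by_key.keys() - right_by_key.keys()),
        "only_right": sorted(right_by_key.keys() - left_by_key.keys()),
        "mismatches": [],  # (key value, column, left value, right value)
    }
    for k in sorted(left_by_key.keys() & right_by_key.keys()):
        a, b = left_by_key[k], right_by_key[k]
        for col in sorted(a.keys() & b.keys()):
            x, y = a[col], b[col]
            numeric = isinstance(x, (int, float)) and isinstance(y, (int, float))
            if numeric:
                equal = math.isclose(x, y, rel_tol=rel_tol, abs_tol=abs_tol)
            else:
                equal = x == y
            if not equal:
                report["mismatches"].append((k, col, x, y))
    return report


# Made-up pipeline outputs for illustration.
expected = [{"id": 1, "amt": 10.0}, {"id": 2, "amt": 5.0}]
actual = [{"id": 1, "amt": 10.004}, {"id": 3, "amt": 1.0}]
loose = compare_rows(expected, actual, "id", abs_tol=0.01)
strict = compare_rows(expected, actual, "id")
```

With the tolerance set, the small floating-point drift in `amt` is accepted; with the default of zero tolerance it is reported as a mismatch, which is the trade-off a configurable tolerance is meant to expose.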
Extremely Fast Python Package Manager Written in Rust
uv is a Python package and project manager written in Rust by Astral. It replaces pip, pip-tools, virtualenv, pyenv, pipx, and poetry with a single tool, and resolves and installs dependencies 10–100x faster than pip, helped by a global cache.