A lightweight Node.js ETL framework for moving data from databases to data lakes and data warehouses. db2lake provides simple configuration-driven extraction with support for incremental loads and multiple output formats.
Python data engineers use db2lake to bootstrap data lake migration projects: it extracts historical data from relational databases and writes it as partitioned Parquet files to S3 or HDFS. Once the initial migration is done, incremental extractions keep the lake in sync, and PySpark or DuckDB pipelines take over for ongoing processing.
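The incremental pattern described above can be sketched in a few lines. This is not db2lake's API (db2lake is a Node.js tool and its configuration is not shown here); it is a minimal illustration of watermark-based extraction using Python's standard `sqlite3` module. The `orders` table, its `updated_at` column, and the helper names `extract_incremental` and `partition_key` are all assumptions made up for the example.

```python
import sqlite3


def extract_incremental(conn, last_watermark):
    """Return rows changed since last_watermark, plus the advanced watermark."""
    # Pull only rows newer than the previous run, oldest first.
    rows = conn.execute(
        "SELECT id, amount, updated_at FROM orders "
        "WHERE updated_at > ? ORDER BY updated_at",
        (last_watermark,),
    ).fetchall()
    # Advance to the newest timestamp seen; keep the old value if nothing changed.
    new_watermark = rows[-1][2] if rows else last_watermark
    return rows, new_watermark


def partition_key(updated_at):
    """Hive-style partition directory name from the date part of an ISO timestamp."""
    return f"dt={updated_at[:10]}"


# Hypothetical source table, held in memory for the sake of the example.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, amount REAL, updated_at TEXT)"
)
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [
        (1, 10.0, "2024-01-01T09:00:00"),
        (2, 25.5, "2024-01-01T17:30:00"),
        (3, 7.25, "2024-01-02T08:15:00"),
    ],
)

# Initial historical load: an empty string sorts before every ISO timestamp.
rows, watermark = extract_incremental(conn, "")

# A later run picks up only the row added since the stored watermark.
conn.execute("INSERT INTO orders VALUES (4, 99.0, '2024-01-03T12:00:00')")
new_rows, watermark2 = extract_incremental(conn, watermark)
```

In a real pipeline the watermark would be persisted between runs, and each batch would be written under the directory returned by `partition_key` so that downstream PySpark or DuckDB jobs can prune partitions by date.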
db2lake is free to use.
db2lake is listed under the Data Ingestion category on Python Data Engineering.
Related
| Tool | Description | Pricing | Rating |
|---|---|---|---|
| Bonobo | Lightweight ETL Framework | Free | ★ 4.2 |
| NumPy | Numerical Computing Library | Free | ★ 4.9 |
| Beautiful Soup | Web Scraping & HTML Parsing | Free | ★ 4.5 |