ML-Powered Deduplication
Python library that uses machine learning to perform deduplication and entity resolution on structured data. Particularly useful for identifying and merging duplicate records that differ in spelling or formatting.
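To make the idea of record deduplication concrete, here is a minimal sketch that uses only the Python standard library. It is not Dedupe's API: it compares records with a fixed string-similarity score, whereas the description above refers to a machine-learning approach. The sample records, the `similarity` and `find_duplicates` helpers, and the 0.75 threshold are all assumptions made for illustration.

```python
from difflib import SequenceMatcher
from itertools import combinations

# Hypothetical sample records; the second is a near-duplicate of the first.
records = [
    {"id": 1, "name": "Acme Corporation", "city": "Chicago"},
    {"id": 2, "name": "ACME Corp.", "city": "Chicago"},
    {"id": 3, "name": "Globex Inc", "city": "Springfield"},
]


def similarity(a, b):
    """Average case-insensitive string similarity over the compared fields, in [0, 1]."""
    fields = ("name", "city")
    scores = [
        SequenceMatcher(None, a[f].lower(), b[f].lower()).ratio()
        for f in fields
    ]
    return sum(scores) / len(scores)


def find_duplicates(rows, threshold=0.75):
    """Return id pairs whose similarity meets the threshold (an assumed cutoff)."""
    return [
        (a["id"], b["id"])
        for a, b in combinations(rows, 2)
        if similarity(a, b) >= threshold
    ]


duplicate_pairs = find_duplicates(records)
print(duplicate_pairs)
```

Comparing every pair of records is quadratic in the number of rows, so this sketch only suits small datasets; a fixed threshold is also a crude stand-in for a model trained on labeled examples.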
Contains affiliate links
Explore similar tools in the Data Quality category that complement Dedupe for your data engineering projects.
Data Validation & Documentation
Comprehensive tool that helps data teams validate, document, and profile their data. Define expectations for your data to ensure it meets quality standards before processing.
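The notion of defining expectations and checking data against them can be sketched in plain Python. This is an illustration of the concept only and does not reproduce any particular tool's API; the sample rows, the rule names, and the `validate` helper are hypothetical.

```python
# Hypothetical rows and rules; names here are illustrative, not a tool's API.
rows = [
    {"order_id": 1, "amount": 25.0, "status": "paid"},
    {"order_id": 2, "amount": -4.0, "status": "paid"},
    {"order_id": 3, "amount": 12.5, "status": "unknown"},
]

# Each expectation is a named predicate that a valid row must satisfy.
expectations = {
    "amount is non-negative": lambda r: r["amount"] >= 0,
    "status is a known value": lambda r: r["status"] in {"paid", "pending", "refunded"},
}


def validate(data, rules):
    """Map each expectation name to the order_ids of rows that violate it."""
    return {
        name: [r["order_id"] for r in data if not check(r)]
        for name, check in rules.items()
    }


failures = validate(rows, expectations)
all_passed = not any(failures.values())
print(failures, all_passed)
```

Running the checks before a processing step lets a pipeline stop, or quarantine the offending rows, when `all_passed` is false.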
Automated Data Profiling
Generates profile reports from pandas DataFrames. Useful for quickly understanding a dataset through interactive HTML reports that include statistics, distributions, and correlations.
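As a rough idea of the kind of information such a profile contains, the sketch below computes summary statistics, correlations, and missing-value counts with plain pandas. It does not call the report generator itself and produces no HTML; the sample DataFrame and its column names are assumptions for illustration.

```python
import pandas as pd

# Hypothetical sample data for illustration.
df = pd.DataFrame({
    "age": [25, 32, 47, 51],
    "income": [30000, 42000, 58000, 61000],
    "city": ["Lyon", "Paris", "Paris", "Nice"],
})

# Per-column statistics for numeric and non-numeric columns alike.
summary = df.describe(include="all")

# Pairwise Pearson correlations between the numeric columns only.
numeric = df.select_dtypes("number")
correlations = numeric.corr()

# Number of missing values in each column.
missing = df.isna().sum()

print(summary)
print(correlations)
print(missing)
```

A full profile report typically adds distribution plots and interactive navigation on top of figures like these.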