ML-Powered Deduplication
Python library that uses machine learning to perform deduplication and entity resolution on structured data. Particularly useful for identifying and merging records that refer to the same real-world entity.
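To make the idea of record deduplication concrete, here is a minimal sketch using only the Python standard library. It is not the library's API and it does not train a model: a fixed string-similarity threshold from difflib stands in for the learned matching step. The sample records, the field names, the helper functions, and the 0.75 threshold are all assumptions chosen for illustration.

```python
from difflib import SequenceMatcher

# Hypothetical sample records; the field names are illustrative only.
records = [
    {"id": 1, "name": "Acme Corporation", "city": "Boston"},
    {"id": 2, "name": "ACME Corp.", "city": "Boston"},
    {"id": 3, "name": "Globex Inc", "city": "Springfield"},
    {"id": 4, "name": "Globex, Inc.", "city": "Springfield"},
    {"id": 5, "name": "Initech", "city": "Austin"},
]


def normalize(record):
    """Lowercase and drop punctuation so trivial differences do not matter."""
    text = f"{record['name']} {record['city']}".lower()
    return "".join(ch for ch in text if ch.isalnum() or ch.isspace()).strip()


def similarity(a, b):
    """Return a 0..1 similarity score between two records."""
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio()


def cluster_duplicates(rows, threshold=0.75):
    """Greedy clustering: a record joins the first cluster whose first
    member is at least `threshold` similar, otherwise it starts a new one."""
    clusters = []
    for row in rows:
        for cluster in clusters:
            if similarity(row, cluster[0]) >= threshold:
                cluster.append(row)
                break
        else:
            clusters.append([row])
    return clusters


clusters = cluster_duplicates(records)
duplicate_groups = [[r["id"] for r in cluster] for cluster in clusters]
print(duplicate_groups)
```

A learned approach would replace the fixed threshold with a classifier trained on labeled pairs, and would usually add a blocking step so that not every pair of records has to be compared.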
Data Validation & Documentation
Comprehensive tool that helps data teams validate, document, and profile their data. Define expectations for your data to ensure it meets quality standards before processing.
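The notion of defining expectations can be sketched in plain Python as a list of named checks that run against a batch before it is processed. This is an illustration of the pattern only, not the tool's API: the helper names (expect_not_null, expect_between, expect_in_set, validate), the sample rows, and the column names are all hypothetical.

```python
# Hypothetical rows standing in for a batch that is about to be processed.
rows = [
    {"order_id": 1, "amount": 19.99, "status": "paid"},
    {"order_id": 2, "amount": -5.00, "status": "paid"},
    {"order_id": 3, "amount": 42.50, "status": "unknown"},
    {"order_id": None, "amount": 10.00, "status": "refunded"},
]


# Each expectation is a (description, check) pair; check takes one row.
def expect_not_null(column):
    return (f"{column} is not null", lambda row: row[column] is not None)


def expect_between(column, low, high):
    return (
        f"{column} between {low} and {high}",
        lambda row: row[column] is not None and low <= row[column] <= high,
    )


def expect_in_set(column, allowed):
    return (f"{column} in {sorted(allowed)}", lambda row: row[column] in allowed)


expectations = [
    expect_not_null("order_id"),
    expect_between("amount", 0, 10000),
    expect_in_set("status", {"paid", "refunded", "pending"}),
]


def validate(batch, checks):
    """Map each expectation's description to the indexes of failing rows."""
    return {
        description: [i for i, row in enumerate(batch) if not check(row)]
        for description, check in checks
    }


report = validate(rows, expectations)
batch_ok = all(not failures for failures in report.values())

for description, failures in report.items():
    print(f"{description}: {'ok' if not failures else f'failed rows {failures}'}")
```

Keeping the description next to each check means the same list can serve as both the validation suite and a readable record of what the data is supposed to look like.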
Automated Data Profiling
Generates profile reports from pandas DataFrames. A quick way to understand a dataset through interactive HTML reports that include statistics, distributions, and correlations.
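As a rough illustration of what a per-column profile contains, the sketch below computes a few of the usual summary fields with the standard library. It is an assumption-laden stand-in: a dict of lists substitutes for a DataFrame, the column names and values are made up, profile_column is a hypothetical helper, and no HTML report, distribution plot, or correlation matrix is produced.

```python
import statistics

# Hypothetical table: column name -> values, with None marking missing cells.
table = {
    "age": [34, 45, None, 29, 51],
    "income": [52000.0, 61000.0, 58000.0, None, 75000.0],
    "city": ["Boston", "Austin", "Boston", None, "Denver"],
}


def profile_column(values):
    """Summarize one column: counts for every type, statistics for numbers."""
    present = [v for v in values if v is not None]
    summary = {
        "count": len(values),
        "missing": len(values) - len(present),
        "distinct": len(set(present)),
    }
    # Numeric statistics only make sense when every present value is a number.
    if present and all(isinstance(v, (int, float)) for v in present):
        summary["min"] = min(present)
        summary["max"] = max(present)
        summary["mean"] = statistics.mean(present)
        summary["stdev"] = statistics.stdev(present) if len(present) > 1 else 0.0
    return summary


profile = {name: profile_column(values) for name, values in table.items()}

for name, summary in profile.items():
    print(name, summary)
```

A full profiling report adds histograms, pairwise correlations, and warnings about issues such as high cardinality or constant columns on top of summaries like these.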