
OlivierBinette/Awesome-Entity-Resolution - GitHub
Splink (Python, SQL, Spark) - Scalable Fellegi-Sunter and rule-based entity resolution using your choice of SQL or Spark backend. Zingg (Python, Java) - Scalable, active learning model for …
GitHub - dedupeio/dedupe: A Python library for accurate and ...
dedupe is a Python library that uses machine learning to perform fuzzy matching, deduplication and entity resolution quickly on structured data. dedupe will help you: remove duplicate entries …
Scalable identity resolution, entity resolution, data mastering and ...
Zingg is an ML-based tool for entity resolution. The following features set Zingg apart from other tools and libraries: the ability to handle any entity, such as customer, patient, supplier, product, etc. …
entity-resolution · GitHub Topics · GitHub
Jul 29, 2025 · Entity resolution (also known as data matching, data linkage, record linkage, and many other terms) is the task of finding entities in a dataset that refer to …
GitHub - dedupeio/dedupe-examples: Examples for using the …
Example scripts for dedupe, a library that uses machine learning to perform de-duplication and entity resolution quickly on structured data. Part of the Dedupe.io cloud service and open …
entity-resolution/entity_resolution_implementations.ipynb at main ...
Entity Resolution - Identifying Real-World Entities in Noisy Data: Fundamental Theories and Python Implementations. In this notebook, we will explore the technical details of fundamental …
GitHub - neo4j-graph-examples/entity-resolution: Entity resolution ...
Entity resolution, also known as data matching or record linkage, is the task of finding records in a data set that refer to the same or similar real entity across different digital entities present on the same or …
Entity resolution for everyone. Minimal. No dependencies.
rezolva is a lightweight, flexible, and extensible entity resolution library implemented in pure Python. It's designed for simplicity, …
An open-source library that leverages Python’s data science
An open-source library that leverages Python’s data science ecosystem to build powerful end-to-end Entity Resolution workflows.
moj-analytical-services/splink - GitHub
Splink is a Python package for probabilistic record linkage (entity resolution) that allows you to deduplicate and link records from datasets that lack unique identifiers. It is used widely by …
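
Several of the entries above mention Fellegi-Sunter scoring and fuzzy matching. As a rough illustration of the idea only, and not the API of Splink, dedupe, Zingg, or any other listed project, the following standard-library sketch scores a pair of records by summing per-field log-likelihood ratios. The field names, the `m`/`u` probabilities, the 0.85 similarity threshold, and the helper names `agrees` and `match_weight` are all assumptions chosen for the example; real tools estimate such parameters from data.

```python
import math
from difflib import SequenceMatcher

# Hypothetical parameters, picked by hand for illustration only.
# m = P(field agrees | records are a true match)
# u = P(field agrees | records are not a match)
PARAMS = {
    "name": (0.9, 0.05),
    "city": (0.8, 0.2),
}


def agrees(a: str, b: str, threshold: float = 0.85) -> bool:
    """Fuzzy agreement: case-insensitive similarity ratio at or above threshold."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio() >= threshold


def match_weight(rec_a: dict, rec_b: dict) -> float:
    """Sum of per-field log2 likelihood ratios (Fellegi-Sunter style weight)."""
    total = 0.0
    for field, (m, u) in PARAMS.items():
        if agrees(rec_a[field], rec_b[field]):
            total += math.log2(m / u)              # agreement adds evidence
        else:
            total += math.log2((1 - m) / (1 - u))  # disagreement subtracts it
    return total


# Made-up records for the example.
rec_a = {"name": "Jon Smith", "city": "Boston"}
rec_b = {"name": "John Smith", "city": "Boston"}
rec_c = {"name": "Maria Garcia", "city": "Denver"}

likely_match = match_weight(rec_a, rec_b)
likely_non_match = match_weight(rec_a, rec_c)
print(likely_match, likely_non_match)
```

A positive weight suggests the pair is more likely a match than not, and a negative weight suggests the opposite. Comparing every pair is quadratic in the number of records, which is one reason scalable tools restrict comparisons to candidate pairs (blocking) before scoring.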