21 results found
🆔 A Python library for accurate and scalable fuzzy matching, record deduplication and entity-resolution.
Created
2012-04-20
3,332 commits to main branch, last one 19 days ago
A C library for parsing/normalizing street addresses around the world. Powered by statistical NLP and open geo data.
Created
2015-03-03
5,370 commits to master branch, last one about a year ago
Fast, accurate and scalable probabilistic data linkage with support for multiple SQL backends
Created
2019-11-22
9,008 commits to master branch, last one a day ago
A powerful and modular toolkit for record linkage and duplicate detection in Python
Created
2015-10-18
912 commits to master branch, last one about a year ago
Straightforward fuzzy matching, information retrieval and NLP building blocks for JavaScript.
Created
2016-03-25
556 commits to master branch, last one 3 years ago
🆔 Examples for using the dedupe library
Created
2014-04-02
1,356 commits to main branch, last one 3 months ago
A list of free data matching and record linkage software.
Created
2018-01-01
53 commits to master branch, last one about a year ago
🔎 Finds fuzzy matches between CSV files
Created
2015-12-08
134 commits to master branch, last one 7 months ago
PyTorch library for transforming entities like companies, products, etc. into vectors to support scalable Record Linkage / Entity Resolution using Approximate Nearest Neighbors.
Created
2021-01-28
373 commits to main branch, last one 2 years ago
Link Discovery Framework for Metric Spaces.
Created
2015-09-16
1,911 commits to master branch, last one 5 months ago
Spark RDD with Lucene's query and entity linkage capabilities
Created
2016-02-03
953 commits to develop branch, last one 15 days ago
Resources for tackling record linkage / deduplication / data matching problems
Created
2017-11-04
29 commits to master branch, last one 2 years ago
Record Linkage ToolKit (Find and link entities)
Created
2017-02-15
636 commits to master branch, last one 3 years ago
A convenient way to link, deduplicate, aggregate and cluster data(frames) in Python using deep learning
Created
2023-08-01
118 commits to main branch, last one 5 months ago
Link Wikidata items to large catalogs
Created
2018-07-11
2,087 commits to master branch, last one 2 years ago
Python package for deduplication/entity resolution using active learning
Created
2021-04-13
294 commits to main branch, last one 2 months ago
Python implementation of anonymous linkage using cryptographic linkage keys
Created
2017-05-30
470 commits to main branch, last one about a year ago
Distributed Bayesian Entity Resolution in Apache Spark
Created
2018-08-27
152 commits to master branch, last one 3 years ago
Curated list of awesome software and resources for Senzing, The First Real-Time AI for Entity Resolution.
Created
2018-11-05
144 commits to main branch, last one 22 days ago
List of entity resolution software and resources.
Created
2023-10-23
7 commits to main branch, last one 8 months ago
Record matching and entity resolution at scale in Spark
Created
2022-04-11
40 commits to main branch, last one about a year ago