24 results found

557
4.2k
mit
119
🆔 A Python library for accurate and scalable fuzzy matching, record deduplication and entity-resolution.
Created 2012-04-20
3,332 commits to main branch, last one 4 months ago
432
4.2k
mit
107
A C library for parsing/normalizing street addresses around the world. Powered by statistical NLP and open geo data.
Created 2015-03-03
5,472 commits to master branch, last one about a month ago
Fast, accurate and scalable probabilistic data linkage with support for multiple SQL backends
Created 2019-11-22
9,267 commits to master branch, last one 4 days ago
156
991
bsd-3-clause
32
A powerful and modular toolkit for record linkage and duplicate detection in Python
Created 2015-10-18
912 commits to master branch, last one about a year ago
Straightforward fuzzy matching, information retrieval and NLP building blocks for JavaScript.
Created 2016-03-25
556 commits to master branch, last one 3 years ago
🆔 Examples for using the dedupe library
Created 2014-04-02
1,356 commits to main branch, last one 7 months ago
A list of free data matching and record linkage software.
Created 2018-01-01
53 commits to master branch, last one about a year ago
Super Fast String Matching in Python
Created 2020-01-02
51 commits to master branch, last one 26 days ago
22
188
other
9
🔎 Finds fuzzy matches between CSV files
Created 2015-12-08
144 commits to master branch, last one about a month ago
PyTorch library for transforming entities like companies, products, etc. into vectors to support scalable Record Linkage / Entity Resolution using Approximate Nearest Neighbors.
Created 2021-01-28
373 commits to main branch, last one 2 years ago
54
130
agpl-3.0
26
Link Discovery Framework for Metric Spaces.
Created 2015-09-16
1,911 commits to master branch, last one 9 months ago
36
125
apache-2.0
11
Spark RDD with Lucene's query and entity linkage capabilities
Created 2016-02-03
954 commits to develop branch, last one about a month ago
Resources for tackling record linkage / deduplication / data matching problems
Created 2017-11-04
29 commits to master branch, last one 2 years ago
A convenient way to link, deduplicate, aggregate and cluster data(frames) in Python using deep learning
Created 2023-08-01
128 commits to main branch, last one about a month ago
23
109
mit
11
Record Linkage ToolKit (Find and link entities)
Created 2017-02-15
636 commits to master branch, last one 3 years ago
9
96
gpl-3.0
6
Link Wikidata items to large catalogs
Created 2018-07-11
2,087 commits to master branch, last one 3 years ago
Python package for deduplication/entity resolution using active learning
Created 2021-04-13
294 commits to main branch, last one 7 months ago
8
65
apache-2.0
7
Python implementation of anonymous linkage using cryptographic linkage keys
Created 2017-05-30
470 commits to main branch, last one about a year ago
List of entity resolution software and resources.
Created 2023-10-23
9 commits to main branch, last one 29 days ago
2
57
apache-2.0
7
Curated list of awesome software and resources for Senzing, The First Real-Time AI for Entity Resolution.
Created 2018-11-05
146 commits to main branch, last one 3 months ago
9
57
other
10
Distributed Bayesian Entity Resolution in Apache Spark
Created 2018-08-27
152 commits to master branch, last one 3 years ago
A browser user interface for manual labeling of record pairs.
Created 2019-11-02
42 commits to master branch, last one about a year ago
Record matching and entity resolution at scale in Spark
Created 2022-04-11
40 commits to main branch, last one about a year ago
Fast, accurate, open-source geocoding in Python
Created 2023-08-06
280 commits to main branch, last one 4 days ago