11 results found

174
848
unknown
63
Four word embedding models implemented in Python, with support for arbitrary context features.
Created 2017-07-16
257 commits to master branch, last one 5 years ago
A TUI tool to help you type faster and learn new layouts. Includes a free cat.
Created 2024-05-21
19 commits to master branch, last one 29 days ago
41
209
unknown
7
Touch typing trainer using N-grams as data source, with options to customize the auto-generated lessons and specify the minimum typing performance needed. There are sound/color effects as well.
Created 2020-10-25
99 commits to master branch, last one about a year ago
20
124
gpl-3.0
12
Colibri core is an NLP tool as well as a C++ and Python library for working with basic linguistic constructions such as n-grams and skipgrams (i.e. patterns with one or more gaps, either of fixed or dy...
Created 2013-09-21
1,449 commits to master branch, last one 4 days ago
5
104
unknown
8
Cluster and merge similar string values: an R implementation of OpenRefine clustering algorithms
Created 2017-03-04
246 commits to master branch, last one 9 months ago
20
78
mit
6
Get n-grams from text
Created 2014-09-18
110 commits to main branch, last one 2 years ago
A fuzzy matching string distance library for Scala and Java that includes Levenshtein distance, Jaro distance, Jaro-Winkler distance, Dice coefficient, N-Gram similarity, Cosine similarity, Jaccard si...
Created 2017-03-02
203 commits to master branch, last one 2 years ago
24
71
other
11
Fast n-Gram Tokenization
Created 2014-05-06
329 commits to master branch, last one about a year ago
Top-k Approximate String Matching.
Created 2017-02-04
292 commits to master branch, last one 3 years ago
Chinese corpus cleaning and quality assessment for large-model pre-training
Created 2023-12-12
23 commits to master branch, last one 4 months ago
Chinese word segmentation implemented with traditional methods (N-gram, HMM, etc.), neural network methods (CNN, LSTM, etc.), and pre-training methods (BERT, etc.)
Created 2022-04-05
4 commits to master branch, last one 2 years ago