30 results found

1.2k
10.3k
apache-2.0
127
Unsupervised text tokenizer for Neural Network-based text generation.
Created 2017-03-07
986 commits to master branch, last one 3 months ago
595
3.9k
apache-2.0
105
Baidu NLP: word segmentation, part-of-speech tagging, named entity recognition, word importance
Created 2018-07-02
155 commits to master branch, last one 3 years ago
298
3.2k
mit
71
SymSpell: 1 million times faster spelling correction & fuzzy search through Symmetric Delete spelling correction algorithm
Created 2014-03-25
488 commits to master branch, last one 26 days ago
274
988
apache-2.0
46
Thai natural language processing in Python
Created 2016-06-23
4,785 commits to dev branch, last one 3 days ago
103
959
mit
26
Unsupervised text tokenizer focused on computational efficiency
This repository has been archived
Created 2019-06-06
85 commits to master branch, last one about a year ago
Python port of SymSpell: 1 million times faster spelling correction & fuzzy search through Symmetric Delete spelling correction algorithm
Created 2018-08-13
292 commits to master branch, last one 5 days ago
CKIP Transformers
Created 2020-08-17
116 commits to master branch, last one about a year ago
Ekphrasis is a text processing tool, geared towards text from social networks, such as Twitter or Facebook. Ekphrasis performs tokenization, word normalization, word segmentation (for splitting hashta...
Created 2017-02-07
77 commits to master branch, last one 2 years ago
125
546
apache-2.0
19
BERT for Multitask Learning
Created 2018-11-29
535 commits to master branch, last one about a year ago
45
433
other
16
Kiwi (an intelligent Korean morphological analyzer)
Created 2017-02-21
1,117 commits to main branch, last one 2 days ago
38
417
apache-2.0
13
AdaSeq: An All-in-One Library for Developing State-of-the-Art Sequence Understanding Models
Created 2022-12-01
246 commits to master branch, last one about a year ago
22
391
mit
12
A Japanese tokenizer based on recurrent neural networks
Created 2018-02-14
194 commits to master branch, last one 5 months ago
44
381
apache-2.0
31
Juman++ (a Morphological Analyzer Toolkit)
Created 2016-10-11
1,093 commits to master branch, last one about a year ago
Cantonese Linguistics and NLP
Created 2014-12-13
328 commits to main branch, last one 6 months ago
51
337
apache-2.0
4
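Pytorch-NLU, a Chinese text classification and sequence labeling toolkit. It supports multi-class and multi-label classification of long and short Chinese texts, as well as sequence labeling tasks such as Chinese named entity recognition, part-of-speech tagging, word segmentation, and extractive text summarization.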
Created 2021-08-29
13 commits to main branch, last one 4 months ago
28
288
other
7
Python API for Kiwi
Created 2019-09-06
670 commits to main branch, last one 2 days ago
A PyTorch implementation of the BI-LSTM-CRF model.
Created 2019-12-03
6 commits to master branch, last one 24 days ago
26
245
other
23
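MONPA (罔拍) is a multi-task model that provides Traditional Chinese word segmentation, part-of-speech tagging, and named entity recognition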
Created 2019-07-23
54 commits to master branch, last one 2 years ago
8
198
bsd-2-clause
3
A lightweight, high-performance Chinese word segmentation project
Created 2023-01-27
7 commits to devel branch, last one about a year ago
9
119
apache-2.0
7
A comparison tool of Japanese tokenizers
Created 2020-08-13
74 commits to master branch, last one about a year ago
15
115
gpl-3.0
9
CKIP CoreNLP Toolkits
Created 2019-04-01
259 commits to master branch, last one about a year ago
🗺️ A learning roadmap for natural language processing
Created 2019-04-17
42 commits to master branch, last one about a year ago
17
103
unlicense
8
Converts from Chinese characters to pinyin, between simplified and traditional, and does word segmentation.
Created 2016-11-23
75 commits to master branch, last one 2 years ago
A Fast and Accurate Vietnamese Word Segmenter (LREC 2018)
Created 2017-09-19
29 commits to master branch, last one 2 years ago
Fast Word Segmentation with Triangular Matrix
Created 2018-04-21
33 commits to master branch, last one 3 years ago
A toolkit for Vietnamese word segmentation
Created 2016-04-11
26 commits to master branch, last one 2 years ago
19
56
apache-2.0
7
Syllable segmentation tool for Myanmar language (Burmese) by Ye.
Created 2017-04-01
189 commits to master branch, last one 10 months ago
A toolkit for pre-processing large source code corpora
Created 2019-04-05
319 commits to master branch, last one 3 years ago