16 results found

2.3k
11.8k
apache-2.0
285
100+ Chinese Word Vectors (over a hundred pre-trained Chinese word vectors)
Created 2018-01-09
134 commits to master branch, last one 11 months ago
pkuseg: a toolkit for multi-domain Chinese word segmentation
Created 2018-08-05
200 commits to master branch, last one 2 years ago
597
3.9k
apache-2.0
106
Baidu NLP: word segmentation, part-of-speech tagging, named entity recognition, word importance
Created 2018-07-02
155 commits to master branch, last one 3 years ago
611
3.3k
mit
86
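Jiagu: a deep-learning natural language processing toolkit. Knowledge-graph relation extraction, Chinese word segmentation, part-of-speech tagging, named entity recognition, sentiment analysis, new word discovery, keyword extraction, text summarization, text clustering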
Created 2018-12-30
107 commits to master branch, last one 2 years ago
287
3.1k
mit
71
SymSpell: 1 million times faster spelling correction & fuzzy search through Symmetric Delete spelling correction algorithm
Created 2014-03-25
485 commits to master branch, last one 5 months ago
806
3.1k
apache-2.0
84
Chinese word segmentation
Created 2018-03-19
218 commits to master branch, last one 5 months ago
273
1.8k
unknown
60
Datasets and SOTA results for every field of Chinese NLP
Created 2019-05-16
278 commits to master branch, last one 3 years ago
211
912
apache-2.0
92
Jcseg is a lightweight NLP framework developed in Java. It provides CJK and English segmentation based on the MMSEG algorithm, along with keyword extraction, key sentence extraction, summary extraction imp...
Created 2014-03-31
680 commits to master branch, last one about a year ago
Python port of SymSpell: 1 million times faster spelling correction & fuzzy search through Symmetric Delete spelling correction algorithm
Created 2018-08-13
284 commits to master branch, last one 22 days ago
The Jieba Chinese Word Segmentation Implemented in Rust
Created 2018-05-06
314 commits to main branch, last one 3 months ago
93
479
apache-2.0
33
High-performance Chinese tokenizer with both GBK and UTF-8 charset support, based on the MMSEG algorithm and developed in ANSI C. Completely based on a modular implementation and can be easily embedded in other...
Created 2014-03-31
148 commits to master branch, last one about a year ago
26
245
other
23
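MONPA (罔拍) is a multi-task model that provides Traditional Chinese word segmentation, part-of-speech tagging, and named entity recognition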
Created 2019-07-23
54 commits to master branch, last one 2 years ago
41
199
unknown
1
A PyTorch implementation of BiLSTM / BERT / RoBERTa (+ BiLSTM + CRF) models for Chinese Word Segmentation.
Created 2021-03-25
41 commits to main branch, last one 2 years ago
A tiny Chinese word segmentation engine with comprehensive algorithm coverage | A micro tokenizer for Chinese
Created 2018-06-12
387 commits to master branch, last one 3 years ago
33
85
apache-2.0
11
Chinese Word Segmentation Tool, a Java implementation of THULAC.
Created 2017-03-03
33 commits to master branch, last one 3 years ago
7
46
gpl-3.0
1
A convenient Chinese word segmentation tool (a simple Chinese word segmenter)
Created 2021-11-10
161 commits to master branch, last one about a month ago