99 results found
Large Scale Chinese Corpus for NLP
Created
2019-02-08
75 commits to master branch, last one 5 months ago
A collection of small corpora of interesting data for the creation of bots and similar stuff.
Created
2014-02-23
803 commits to master branch, last one 9 months ago
Search all Chinese NLP datasets, with commonly used English NLP datasets included
Created
2020-02-21
39 commits to master branch, last one about a year ago
Chinese Language Understanding Evaluation Benchmark: datasets, baselines, pre-trained models, corpus and leaderboard
Created
2019-11-22
527 commits to master branch, last one 5 months ago
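Chinese personal name corpus. Name generator. Chinese full names, surnames, given names, forms of address, Japanese names, transliterated names, English names. Useful for Chinese word segmentation and person-name entity recognition.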
Created
2016-12-08
80 commits to master branch, last one 7 months ago
Python & Command-line tool to gather text and metadata on the Web: Crawling, scraping, extraction, output as CSV, JSON, HTML, MD, TXT, XML
Created
2019-04-08
1,570 commits to master branch, last one 2 days ago
Deep learning and deep reinforcement learning research papers and some code
Created
2016-11-28
2,833 commits to master branch, last one 8 months ago
Final Weibo Crawler. Scrape anything from Weibo: comments, Weibo contents, followers, anything. The Terminator
Created
2017-04-13
36 commits to master branch, last one 5 years ago
Awesome Chatbot Projects, Corpus, Papers, Tutorials. Chinese Chatbot =>:
Created
2017-09-01
11 commits to master branch, last one 7 years ago
Corpus for training Chinese and English dialogue systems (Datasets for Training Chatbot System)
Created
2017-03-14
9 commits to master branch, last one 7 years ago
A multilingual dialog corpus
Created
2017-01-11
396 commits to master branch, last one 4 years ago
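Company name corpus. Organization name corpus. Company short names, abbreviations, brand terms, enterprise names. Useful for Chinese word segmentation and organization-name entity recognition.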
Created
2018-10-10
46 commits to master branch, last one 7 months ago
A very comprehensive parallel corpus of Classical Chinese (ancient texts) and modern Chinese
Created
2022-01-11
28 commits to main branch, last one 6 months ago
🚁 Insurance industry corpus, chatbot
Created
2017-07-26
8 commits to master branch, last one 3 months ago
Large-scale Pre-training Corpus for Chinese, 100 GB
Created
2020-01-25
25 commits to master branch, last one 2 years ago
Collections of Chinese NLP corpus
Created
2018-12-28
21 commits to master branch, last one 3 years ago
ChatGPT Chinese corpus: dialogue corpus, fiction corpus, customer-service corpus, for training large models
Created
2023-04-26
31 commits to main branch, last one 5 months ago
An R package for the Quantitative Analysis of Textual Data
Created
2012-08-15
11,441 commits to master branch, last one about a month ago
Chatbot in 200 lines of code using TensorLayer
Created
2017-09-05
41 commits to master branch, last one 5 years ago
Crawl BookCorpus
Created
2018-07-14
50 commits to master branch, last one about a year ago
Chinese medical information processing benchmark CBLUE: A Chinese Biomedical Language Understanding Evaluation Benchmark
Created
2021-04-30
98 commits to main branch, last one about a year ago
Curated List of Persian Natural Language Processing and Information Retrieval Tools and Resources
Created
2016-07-15
232 commits to master branch, last one about a year ago
Korean corpus repository
Created
2020-08-14
451 commits to master branch, last one 3 years ago
An Integrated Corpus Tool With Multilingual Support for the Study of Language, Literature, and Translation
Created
2018-08-23
1,450 commits to main branch, last one a day ago
❤️ Emotional First Aid Dataset: psychological counseling Q&A and chatbot corpus
Created
2020-04-22
7 commits to master branch, last one 3 months ago
A Curated List of Datasets and Usable Library Resources for NLP in Bahasa Indonesia
Created
2020-03-31
63 commits to master branch, last one 2 years ago
[NeurIPS D&B 2024] Generative AI for Math: MathPile
Created
2023-11-27
33 commits to main branch, last one 10 days ago
Chinese NLP corpus of Chinese science fiction: about 4,675 Chinese science fiction novels. Chinese science fiction text corpus and text database.
Created
2020-07-31
5 commits to master branch, last one 2 years ago
KH Coder: for Quantitative Content Analysis or Text Mining
Created
2018-05-03
3,363 commits to master branch, last one 27 days ago
We gather Malaysian datasets! https://malaysian-dataset.readthedocs.io/
Created
2017-10-30
1,131 commits to master branch, last one 14 days ago