21 results found

91
1.1k
unknown
70
Curated collection of papers for the NLP practitioner 📖👩‍🔬
Created 2017-08-10
62 commits to master branch, last one 4 years ago
62
374
unknown
8
Chinese NLP corpus of Chinese science fiction: about 4,675 Chinese science fiction novels. Serves as a Chinese science fiction natural language processing corpus, text corpus, and text database.
Created 2020-07-31
5 commits to master branch, last one 2 years ago
54
372
apache-2.0
20
multi_task_NLP is a utility toolkit enabling NLP developers to easily train and infer a single model for multiple tasks.
Created 2020-03-31
88 commits to master branch, last one 4 years ago
50
362
unknown
33
Open linguistic datasets: the Russian sentiment lexicon KartaSlovSent (КартаСловСент), a semantics dataset, an association graph, and a dataset of spelling errors and typos.
Created 2017-03-31
20 commits to master branch, last one 3 years ago
Chinese and English NER datasets, an English-Chinese machine translation dataset, and a Chinese word segmentation dataset.
Created 2018-06-08
21 commits to master branch, last one 3 years ago
22
255
cc-by-4.0
13
UA-GEC: Grammatical Error Correction and Fluency Corpus for the Ukrainian Language
Created 2021-01-19
114 commits to main branch, last one 10 months ago
17
177
unknown
3
A fine-tuned LLaMA that is good at arithmetic tasks
Created 2023-05-18
34 commits to main branch, last one about a year ago
19
173
unknown
11
TriggerNER: Learning with Entity Triggers as Explanations for Named Entity Recognition (ACL 2020)
Created 2020-04-12
28 commits to master branch, last one 3 years ago
41
171
unknown
6
Implementation of Very Deep Convolutional Neural Network for Text Classification
Created 2017-08-29
26 commits to main branch, last one 3 years ago
A Constrained Text Generation Challenge Towards Generative Commonsense Reasoning
Created 2020-01-30
13 commits to master branch, last one 11 months ago
8
126
apache-2.0
10
🌍 Repository for "AppWorld: A Controllable World of Apps and People for Benchmarking Interactive Coding Agents", ACL'24 Best Resource Paper.
Created 2024-06-23
78 commits to main branch, last one 21 days ago
22
99
unknown
3
Chinese NLP corpus of Chinese science fiction: archive of the Ark Plan of the Ula Science Fiction Website (乌拉科幻小说网). Serves as a Chinese science fiction natural language processing corpus, text corpus, and text database.
Created 2020-05-04
8 commits to master branch, last one 2 years ago
Yorùbá language training text for NLP, ASR and TTS tasks
Created 2018-01-17
91 commits to master branch, last one about a year ago
The release of the FreebaseQA data set (NAACL 2019).
Created 2017-12-21
11 commits to master branch, last one 2 years ago
Code and data for "Summarising Historical Text in Modern Languages" (EACL 2021)
Created 2021-01-06
14 commits to main branch, last one 3 years ago
A collection of datasets for Ukrainian language
Created 2021-03-23
195 commits to main branch, last one 5 months ago
AfriSenti-SemEval Shared Task 12: Sentiment Analysis for African languages: https://afrisenti-semeval.github.io/
Created 2022-07-09
211 commits to main branch, last one 11 months ago
WikiWhy is a new benchmark for evaluating LLMs' ability to explain cause-effect relationships. It is a QA dataset containing 9000+ "why" question-answer-rationale triplets.
Created 2022-10-24
12 commits to main branch, last one about a year ago
9
39
other
3
Comprehensive evaluation framework for Open Information Extraction.
Created 2021-09-14
16 commits to main branch, last one 2 years ago
A small test dataset for use with OpenAI's ChatGPT
Created 2023-01-13
10 commits to main branch, last one about a year ago
A list of Romanian NLP Datasets
Created 2023-05-23
45 commits to main branch, last one 2 months ago