17 results found

1.1k
1.6k
unknown
44
NLTK Data
Created 2012-05-10
392 commits to gh-pages branch, last one 10 hours ago
A collection of corpora for named entity recognition (NER) and entity recognition tasks. These annotated datasets cover a variety of languages, domains and entity types.
Created 2018-09-01
257 commits to master branch, last one 2 months ago
138
1.0k
lgpl-2.1
39
Data repository for pretrained NLP models and NLP corpora.
Created 2017-10-13
75 commits to master branch, last one 6 years ago
A collaborative catalog of NLP resources for Indic languages
Created 2019-08-29
286 commits to master branch, last one 2 months ago
21
291
mit
18
Links to Russian corpora + Python functions for loading and parsing
Created 2019-04-26
171 commits to master branch, last one about a year ago
21
257
apache-2.0
27
Official source for Spanish Language Models and resources made @ BSC-TEMU within the "Plan de las Tecnologías del Lenguaje" (Plan-TL).
Created 2021-06-17
59 commits to main branch, last one about a year ago
A web-based engine for creating and annotating textual corpora
Created 2014-03-27
2,786 commits to master branch, last one about a year ago
Open Korean NLP Dataset Curation for the Users All Around the Globe
Created 2020-10-18
26 commits to master branch, last one about a year ago
25
126
mit
4
CrossNER: Evaluating Cross-Domain Named Entity Recognition (AAAI-2021)
Created 2020-12-08
35 commits to main branch, last one 4 years ago
25
106
bsd-3-clause
12
The Self-dialogue Corpus - a collection of self-dialogues across music, movies and sports
Created 2017-11-04
24 commits to main branch, last one 11 months ago
Unannotated Spanish 3 Billion Words Corpora
Created 2019-05-13
12 commits to master branch, last one 2 years ago
A curated list of resources dedicated to Natural Language Processing (NLP) of Cantonese | 粵語 NLP
Created 2020-01-08
16 commits to main branch, last one 3 years ago
An R package for dynamic exploration of text collections
Created 2019-03-17
585 commits to master branch, last one 5 months ago
22
64
gpl-2.0
11
An advanced, extensible web front-end for the Manatee-open corpus search engine
Created 2015-04-14
12,993 commits to master branch, last one a day ago
The Official Repository for 👉 CCAE: A Corpus of Chinese-based Asian Englishes @ NLPCC 2023
Created 2023-10-08
11 commits to main branch, last one about a year ago
12
47
unknown
9
Named Entity Recognition for biomedical entities
Created 2019-02-11
57 commits to master branch, last one 2 years ago
A comprehensive list of annotated training datasets classified by use case.
Created 2022-05-25
205 commits to main branch, last one 2 years ago