10 results found
Resources for conservation, development, and documentation of low-resource (human) languages.
Created
2014-07-23
428 commits to master branch, last one 9 months ago
This repository contains the code, data, and models of the paper titled "XL-Sum: Large-Scale Multilingual Abstractive Summarization for 44 Languages" published in Findings of the Association for Compu...
dataset
multilingual
deep-learning
multilinguality
machine-learning
text-summarisation
text-summarization
summarization-corpora
summarization-dataset
low-resource-languages
text-summarization-model
abstractive-summarization
low-resource-summarization
multilingual-summarization
text-summarization-dataset
abstractive-text-summarization
low-resource-text-summarizarion
multilingual-text-summarization
Created
2021-06-26
17 commits to master branch, last one 10 months ago
This repository contains the code and data of the paper titled "Not Low-Resource Anymore: Aligner Ensembling, Batch Filtering, and New Datasets for Bengali-English Machine Translation" published in Pr...
Created
2020-10-05
22 commits to master branch, last one 3 months ago
Language Identification with Support for More Than 2000 Labels -- EMNLP 2023
Created
2023-09-26
17 commits to main branch, last one 2 months ago
A repository for publicly/freely available Natural Language Processing (NLP) datasets for African languages.
Created
2021-05-01
49 commits to main branch, last one 9 months ago
Open-source benchmark datasets and pretrained transformer models in the Filipino language.
Created
2020-05-04
68 commits to master branch, last one 5 months ago
Speech synthesis (TTS) in low-resource languages by training from scratch with Fastpitch and fine-tuning with HifiGan
Created
2022-12-02
95 commits to main branch, last one about a year ago
NLP pipelines for Tagalog using spaCy
Created
2022-10-17
279 commits to master branch, last one a day ago
CogNet: a large-scale, high-quality cognate database for 338 languages, 1.07M words, and 8.1 million cognates
Created
2019-07-28
76 commits to master branch, last one about a year ago
SemEval2024-task 11: Bridging the Gap in Text-Based Emotion Detection
Created
2024-06-04
485 commits to main branch, last one 2 days ago