Search Results - RepositoryStats

59

408

cc-by-sa-4.0

34

Resources for conservation, development, and documentation of low resource (human) languages.

nlp list lrls awesome awesome-list human-language natural-language language-learning minority-language language-resources resourced-languages endangered-languages language-documentation low-resource-languages natural-language-processing

Created 2014-07-23

428 commits to master branch, last one 11 months ago

xl-sum csebuetnlp

42

264

unknown

6

This repository contains the code, data, and models of the paper titled "XL-Sum: Large-Scale Multilingual Abstractive Summarization for 44 Languages" published in Findings of the Association for Compu...

dataset multilingual deep-learning multilinguality machine-learning text-summarisation text-summarization summarization-corpora summarization-dataset low-resource-languages text-summarization-model abstractive-summarization low-resource-summarization multilingual-summarization text-summarization-dataset abstractive-text-summarization low-resource-text-summarizarion multilingual-text-summarization

Created 2021-06-26

17 commits to master branch, last one 12 months ago

banglanmt csebuetnlp

46

147

unknown

9

This repository contains the code and data of the paper titled "Not Low-Resource Anymore: Aligner Ensembling, Batch Filtering, and New Datasets for Bengali-English Machine Translation" published in Pr...

bangla-nlp emnlp-2020 parallel-corpus low-resource-nlp parallel-corpora machine-translation low-resource-languages bangla-machine-translation neural-machine-translation low-resource-machine-translation bangla-dataset-machine-translation

Created 2020-10-05

22 commits to master branch, last one 5 months ago

GlotLID cisnlp

8

122

apache-2.0

5

💬 Language Identification with Support for More Than 2000 Labels -- EMNLP 2023

lid glot glotcc langid glotlid multlingual low-resource-nlp language-detector language-detection language-identifier language-recognition language-detection-lib low-resource-languages language-classification language-identification language-detection-library language-identification-toolkit

Created 2023-09-26

17 commits to main branch, last one 3 months ago

africanlp-public-datasets Andrews2017

23

103

unknown

6

A repository for publicly/freely available Natural Language Processing (NLP) datasets for African languages.

datasets african-languages low-resource-languages natural-language-processing

Created 2021-05-01

49 commits to main branch, last one 11 months ago

Filipino-Text-Benchmarks jcblaisecruz02

9

61

gpl-3.0

3

Open-source benchmark datasets and pretrained transformer models in the Filipino language.

nli bert corpus electra tagalog filipino benchmark transformer deep-learning electra-models transfer-learning text-classification tagalog-transformers low-resource-languages

Created 2020-05-04

68 commits to master branch, last one 7 months ago

Turkish-Text-to-Speech Rumeysakeskin

5

55

unknown

5

Speech synthesis (TTS) in low-resource languages by training from scratch with FastPitch and fine-tuning with HiFi-GAN

tts hifigan pytorch fastpitch nvidia-nemo nvidia-docker speech-synthesis waveform-generator phonetical-conversion spectrogram-generator low-resource-languages turkish-text-to-speech

Created 2022-12-02

95 commits to main branch, last one about a year ago