11 results found
Python & Command-line tool to gather text and metadata on the Web: Crawling, scraping, extraction, output as CSV, JSON, HTML, MD, TXT, XML
Created: 2019-04-08
1,587 commits to master branch, last one a day ago
An Integrated Corpus Tool With Multilingual Support for the Study of Language, Literature, and Translation
Created: 2018-08-23
1,453 commits to main branch, last one 20 days ago
Bitextor generates translation memories from multilingual websites
Created: 2018-04-16
4,001 commits to master branch, last one about a month ago
UA-GEC: Grammatical Error Correction and Fluency Corpus for the Ukrainian Language
Created: 2021-01-19
114 commits to main branch, last one 10 months ago
Simple multilingual lemmatizer for Python, especially useful for speed and efficiency
Created: 2021-01-18
233 commits to main branch, last one about a month ago
Python library for handling audio datasets.
Created: 2017-11-27
600 commits to master branch, last one 4 years ago
OpusFilter - Parallel corpus processing toolkit
Created: 2019-11-06
268 commits to develop branch, last one 4 months ago
Utilities for Processing the Switchboard Dialogue Act Corpus
Created: 2018-11-14
48 commits to master branch, last one 3 years ago
An advanced, extensible web front-end for the Manatee-open corpus search engine
Created: 2015-04-14
12,928 commits to master branch, last one 11 days ago
SpeCT - Speech Corpus Toolkit for Praat. Documentation: https://lennes.github.io/spect/
Created: 2017-03-08
49 commits to master branch, last one about a year ago
A parser for annotated MuseScore 3 files.
Created: 2020-05-04
1,593 commits to main branch, last one 2 months ago