csebuetnlp / normalizer

This Python module is an easy-to-use port of the text normalization used in the paper "Not low-resource anymore: Aligner ensembling, batch filtering, and new datasets for Bengali-English machine translation". It is intended for normalizing and cleaning Bengali and English text.
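To give a rough idea of what text normalization involves, here is a minimal sketch using only the Python standard library. It is not the csebuetnlp/normalizer API: the function name `normalize_sketch` and its `unicode_norm` parameter are hypothetical, and the two steps shown (Unicode normalization and whitespace collapsing) are assumed to be typical of such a cleaner rather than taken from the module itself.

```python
import re
import unicodedata


def normalize_sketch(text: str, unicode_norm: str = "NFKC") -> str:
    """Hypothetical helper for illustration; not the module's real API."""
    # Map visually equivalent code point sequences to one canonical form.
    text = unicodedata.normalize(unicode_norm, text)
    # Collapse runs of whitespace (spaces, tabs, newlines) into one space.
    text = re.sub(r"\s+", " ", text)
    return text.strip()


# The "fi" ligature (U+FB01) becomes plain "fi"; extra whitespace is dropped.
print(normalize_sketch("\ufb01ne   text\n"))  # fine text

# Bengali vowel sign O typed as two code points (U+09C7 + U+09BE)
# is composed into the single code point U+09CB.
print(normalize_sketch("\u0995\u09c7\u09be") == "\u0995\u09cb")  # True
```

For real use, consult the repository's README for the actual function names and options.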

Date Created 2021-09-06 (3 years ago)
Commits 7 (last one about a year ago)
Stargazers 35 (0 this week)
Watchers 4 (0 this week)
Forks 7
License unknown
Ranking

RepositoryStats indexes 634,026 repositories. Of these, csebuetnlp/normalizer is ranked #574,829 (9th percentile) for total stargazers and #366,578 for total watchers. GitHub reports the primary language for this repository as Python; among repositories using this language, it is ranked #115,325 out of 129,650.
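The percentile can be checked from the rank and the index size. The sketch below assumes the percentile is the share of indexed repositories ranked below this one; RepositoryStats does not state its formula here, so the function `percentile_from_rank` is an illustrative assumption. The Python-only figure is derived under that same assumption and is not a value reported by the site.

```python
def percentile_from_rank(rank: int, total: int) -> float:
    """Percentage of repositories ranked below `rank` (assumed definition)."""
    return (1 - rank / total) * 100


# Overall stargazer ranking: #574,829 of 634,026 indexed repositories.
overall = percentile_from_rank(574_829, 634_026)

# Python-only ranking: #115,325 of 129,650 (derived, not reported by the site).
python_only = percentile_from_rank(115_325, 129_650)

print(f"{overall:.1f}")      # 9.3
print(f"{python_only:.1f}")  # 11.0
```

Rounding 9.3 down gives the 9th percentile quoted above.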

Other Information

csebuetnlp/normalizer has 1 open pull request on GitHub, and 1 pull request has been merged over the lifetime of the repository.

Star History

GitHub stargazers over time

[Chart: stargazer count (axis 0 to 35), 2022 to 2025]

Watcher History

GitHub watchers over time (collection started in '23)

[Chart: watcher count (axis 3 to 5), Oct '24 to Mar '25]

Recent Commit History

3 commits on the default branch (main) since Jan '22

[Chart: commit count (axis 0 to 3), Jul '22 to 2025]

Yearly Commits

Commits to the default branch (main) per year

[Chart: commits per year (axis 0 to 4), for 2021, 2022, and 2024]

Issue History

Total Issues
Open Issues
Closed Issues
[Chart: total, open, and closed issues (axis 0 to 1), Jul '22 to 2025]

Languages

The only known language in this repository is Python


updated: 2025-01-16 @ 03:36pm, id: 403608524 / R_kgDOGA6TzA