csebuetnlp / normalizer
This Python module is an easy-to-use port of the text normalization used in the paper "Not low-resource anymore: Aligner ensembling, batch filtering, and new datasets for Bengali-English machine translation". It is intended for normalizing and cleaning Bengali and English text.
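To give a rough sense of what text normalization involves, here is a minimal sketch using only the Python standard library. It is not the module's implementation or API: the function name `normalize_text_sketch` and its `form` parameter are hypothetical, and the two steps shown (Unicode normalization and whitespace collapsing) are assumed to be typical cleaning steps, not a description of what the paper's pipeline does.

```python
import re
import unicodedata


def normalize_text_sketch(text: str, form: str = "NFKC") -> str:
    """Hypothetical illustration of basic text cleaning (not the library's code)."""
    # Map visually equivalent code point sequences to one canonical form,
    # e.g. full-width Latin letters and non-breaking spaces under NFKC.
    text = unicodedata.normalize(form, text)
    # Collapse any run of whitespace (spaces, tabs, newlines) into one space.
    text = re.sub(r"\s+", " ", text)
    # Trim leading and trailing whitespace.
    return text.strip()


cleaned = normalize_text_sketch("  Hello\u00a0\u00a0 world\n")
```

Zero-width joiners are deliberately left untouched in this sketch, since they can carry meaning in Bengali text.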
RepositoryStats indexes 589,134 repositories. Of these, csebuetnlp/normalizer is ranked #540,631 (8th percentile) for total stargazers and #374,926 for total watchers. GitHub reports the primary language for this repository as Python; among repositories using this language, it is ranked #106,069/117,584.
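The percentile figure can be reproduced from the rank and the total, assuming it is the share of indexed repositories ranked below this one. That formula is an assumption, not something the page states, and the helper name `percentile_from_rank` is hypothetical.

```python
def percentile_from_rank(rank: int, total: int) -> float:
    """Percentage of repositories ranked below `rank` (assumed definition)."""
    return 100.0 * (total - rank) / total


# Stargazer rank among all indexed repositories.
stars_pct = percentile_from_rank(540_631, 589_134)
# Rank among repositories whose primary language is Python.
python_pct = percentile_from_rank(106_069, 117_584)

print(f"stargazers: {stars_pct:.1f}%  python: {python_pct:.1f}%")
```

Under this assumption the stargazer figure rounds to the 8th percentile quoted above, and the Python-only ranking falls just under the 10th percentile.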
csebuetnlp/normalizer has 1 open pull request on GitHub; 1 pull request has been merged over the lifetime of the repository.
Star History
GitHub stargazers over time
Watcher History
GitHub watchers over time; collection started in '23
Recent Commit History
3 commits on the default branch (main) since Jan '22
Yearly Commits
Commits to the default branch (main) per year
Issue History
Languages
The only known language in this repository is Python
updated: 2024-09-06 @ 09:40am, id: 403608524 / R_kgDOGA6TzA