1 result found

13
65
gpl-3.0
13
Unicode tokeniser. Ucto tokenizes text files: it separates words from punctuation, and splits sentences. It offers several other basic preprocessing steps such as changing case that you can all use to...
Created 2013-03-26
1,579 commits to master branch, last one 4 days ago