4 results found

238
3.2k
apache-2.0
30
Python & command-line tool to gather text on the Web: Crawling & scraping, content extraction, metadata. TXT, Markdown, CSV & XML output.
Created 2019-04-08
1,522 commits to master branch, last one 4 days ago
Document layout analysis resources and repositories for development with PdfPig.
Created 2019-09-02
181 commits to master branch, last one 9 months ago
Hand-written dictionaries from the FreeDict project.
Created 2015-08-05
1,722 commits to master branch, last one 7 months ago
The main TEI Publisher app
Created 2020-06-03
2,778 commits to master branch, last one about a month ago