8 results found

248
1.8k
apache-2.0
49
Read and extract text and other content from PDFs in C# (port of PDFBox)
Created 2017-11-09
1,617 commits to master branch, last one 9 days ago
197
1.7k
gpl-3.0
76
A Gtk/Qt front-end to tesseract-ocr.
Created 2014-02-10
2,272 commits to master branch, last one 10 days ago
140
771
apache-2.0
27
OCR engine for all the languages
Created 2015-05-19
2,155 commits to main branch, last one 14 days ago
Document layout analysis resources for development with PdfPig.
Created 2019-09-02
181 commits to master branch, last one about a year ago
Validate and transform various OCR file formats (hOCR, ALTO, PAGE, FineReader)
Created 2016-04-08
288 commits to master branch, last one 3 months ago
Conversions between various OCR formats
Created 2015-08-19
33 commits to master branch, last one about a year ago
Convert between Tesseract hOCR and ALTO XML using XSL stylesheets
Created 2015-11-25
93 commits to master branch, last one 6 months ago
Text Overlay plugin for Mirador 3
Created 2020-07-06
241 commits to main branch, last one 2 months ago