3 results found

Enhance Tesseract OCR output for scanned PDFs by applying Large Language Model (LLM) corrections.
Created 2023-07-26
52 commits to main branch, last one 3 months ago
11
64
apache-2.0
4
Python 3 library for processing historical English
Created 2019-03-20
65 commits to master branch, last one 3 months ago
Source code for the paper "Post-OCR Document Correction with Large Ensembles of Character Sequence-to-Sequence Models"
Created 2021-07-29
42 commits to master branch, last one 11 months ago