Search Results - RepositoryStats

155

2.3k

unknown

20

Enhance Tesseract OCR output for scanned PDFs by applying Large Language Model (LLM) corrections.

llm ocr llama2 ai-assist tesseract ocr-correction

Created 2023-07-26

52 commits to main branch, last one 4 months ago

11

64

apache-2.0

4

Python 3 library for processing historical English

english nlp-library ocr-correction historical-data non-standard-data digital-humanities historical-english ocr-post-processing spelling-correction historical-linguistics spelling-normalization

Created 2019-03-20

65 commits to master branch, last one 4 months ago

5

36

mit

3

Source code for the paper "Post-OCR Document Correction with Large Ensembles of Character Sequence-to-Sequence Models"

deep-learning ocr-correction sequence-to-sequence

Created 2021-07-29

42 commits to master branch, last one about a year ago