10 results found

450
4.6k
apache-2.0
72
A Unified Toolkit for Deep Learning Based Document Image Analysis
Created 2020-06-10
182 commits to main branch, last one about a year ago
225
1.6k
apache-2.0
45
Read and extract text and other content from PDFs in C# (port of PDFBox)
Created 2017-11-09
1,556 commits to master branch, last one 2 days ago
153
1.5k
mit
12
An Open-Source Python3 tool for recognizing layouts, tables, math formulas (LaTeX), and text in images, converting them into Markdown format. A free alternative to Mathpix, empowering seamless convers...
Created 2022-09-07
452 commits to main branch, last one 5 days ago
123
672
apache-2.0
25
OCR engine for all the languages
Created 2015-05-19
2,113 commits to main branch, last one 4 days ago
Document Layout Analysis resources repos for development with PdfPig.
Created 2019-09-02
181 commits to master branch, last one 9 months ago
44
174
apache-2.0
14
A toolbox of OCR models, algorithms, and pipelines based on MindSpore
Created 2022-12-20
821 commits to main branch, last one 2 days ago
Doc2Graph transforms documents into graphs and exploits a GNN to solve several tasks.
Created 2022-04-28
53 commits to master branch, last one about a year ago
10
71
apache-2.0
1
An official implementation of the paper "Paragraph2Graph: A Language-independent GNN-based framework for layout analysis"
Created 2022-12-23
38 commits to main branch, last one 8 months ago
[ICDAR 2023] SelfDocSeg: A self-supervised vision-based approach towards Document Segmentation (Oral)
Created 2023-04-30
14 commits to main branch, last one 8 months ago