Statistics for topic ocr
RepositoryStats tracks 595,858 Github repositories, of these 604 are tagged with the ocr topic. The most common primary language for repositories using this topic is Python (291). Other languages include: Jupyter Notebook (42), C++ (41), TypeScript (31), Java (28), JavaScript (26), C# (25), Swift (11)
Stargazers over time for topic ocr
Most starred repositories for topic ocr (view more)
Trending repositories for topic ocr (view more)
Python3 package for Chinese/English OCR, with paddleocr-v4 onnx model(~14MB). 基于ppocr-v4-onnx模型推理,可实现 CPU 上毫秒级的 OCR 精准预测,通用场景中英文OCR达到开源SOTA。
A high-quality tool for convert PDF to Markdown and JSON.一站式开源高质量数据提取工具,将PDF转换成Markdown和JSON格式。
A community-supported supercharged version of paperless: scan, index and archive all your physical documents
Awesome multilingual OCR toolkits based on PaddlePaddle (practical ultra lightweight OCR system, support 80+ languages recognition, provide data annotation and synthesis tools, support training and de...
A privacy-first, self-hosted, fully open source personal knowledge management software, written in typescript and golang.
Python3 package for Chinese/English OCR, with paddleocr-v4 onnx model(~14MB). 基于ppocr-v4-onnx模型推理,可实现 CPU 上毫秒级的 OCR 精准预测,通用场景中英文OCR达到开源SOTA。
OpenOCR: A general OCR system with accuracy and efficiency. Supporting 24 Scene Text Recognition methods trained from scratch on large-scale real datasets, and will continue to add the latest methods.
Python3 package for Chinese/English OCR, with paddleocr-v4 onnx model(~14MB). 基于ppocr-v4-onnx模型推理,可实现 CPU 上毫秒级的 OCR 精准预测,通用场景中英文OCR达到开源SOTA。
A high-quality tool for convert PDF to Markdown and JSON.一站式开源高质量数据提取工具,将PDF转换成Markdown和JSON格式。
A community-supported supercharged version of paperless: scan, index and archive all your physical documents
Awesome multilingual OCR toolkits based on PaddlePaddle (practical ultra lightweight OCR system, support 80+ languages recognition, provide data annotation and synthesis tools, support training and de...
Python3 package for Chinese/English OCR, with paddleocr-v4 onnx model(~14MB). 基于ppocr-v4-onnx模型推理,可实现 CPU 上毫秒级的 OCR 精准预测,通用场景中英文OCR达到开源SOTA。
OpenOCR: A general OCR system with accuracy and efficiency. Supporting 24 Scene Text Recognition methods trained from scratch on large-scale real datasets, and will continue to add the latest methods.
Python3 package for Chinese/English OCR, with paddleocr-v4 onnx model(~14MB). 基于ppocr-v4-onnx模型推理,可实现 CPU 上毫秒级的 OCR 精准预测,通用场景中英文OCR达到开源SOTA。
A powerful command-line OCR tool built with Apple's Vision framework, supporting single image and batch processing with detailed positional information output.
A high-quality tool for convert PDF to Markdown and JSON.一站式开源高质量数据提取工具,将PDF转换成Markdown和JSON格式。
A community-supported supercharged version of paperless: scan, index and archive all your physical documents
Awesome multilingual OCR toolkits based on PaddlePaddle (practical ultra lightweight OCR system, support 80+ languages recognition, provide data annotation and synthesis tools, support training and de...
A privacy-first, self-hosted, fully open source personal knowledge management software, written in typescript and golang.
Python3 package for Chinese/English OCR, with paddleocr-v4 onnx model(~14MB). 基于ppocr-v4-onnx模型推理,可实现 CPU 上毫秒级的 OCR 精准预测,通用场景中英文OCR达到开源SOTA。
Yomitoku is an AI-powered document image analysis package designed specifically for the Japanese language.
A powerful command-line OCR tool built with Apple's Vision framework, supporting single image and batch processing with detailed positional information output.
A high-quality tool for convert PDF to Markdown and JSON.一站式开源高质量数据提取工具,将PDF转换成Markdown和JSON格式。
Ingest, parse, and optimize any data format ➡️ from documents to multimedia ➡️ for enhanced compatibility with GenAI frameworks
OpenRecall is a fully open-source, privacy-first alternative to proprietary solutions like Microsoft's Windows Recall. With OpenRecall, you can easily access your digital history, enhancing your memor...
Document (PDF) extraction and parse API using state of the art modern OCRs + Ollama supported models. Anonymize documents. Remove PII. Convert any document or picture to structured JSON or Markdown
A high-quality tool for convert PDF to Markdown and JSON.一站式开源高质量数据提取工具,将PDF转换成Markdown和JSON格式。
OCR software, free and offline. 开源、免费的离线OCR软件。支持截屏/批量导入图片,PDF文档识别,排除水印/页眉页脚,扫描/生成二维码。内置多国语言库。
A community-supported supercharged version of paperless: scan, index and archive all your physical documents
A privacy-first, self-hosted, fully open source personal knowledge management software, written in typescript and golang.
Awesome multilingual OCR toolkits based on PaddlePaddle (practical ultra lightweight OCR system, support 80+ languages recognition, provide data annotation and synthesis tools, support training and de...
Document (PDF) extraction and parse API using state of the art modern OCRs + Ollama supported models. Anonymize documents. Remove PII. Convert any document or picture to structured JSON or Markdown
Desktop app for automatically translating comics - BDs, Manga, Manhwa, Fumetti and more in a variety of formats (Image, Pdf, Epub, cbr, cbz, etc) and in multiple languages.
A ready-to-use, ready-to-go translation ocr tool developed by WPF/WPF 开发的一款即开即用、即用即走的翻译、OCR工具
OpenRecall is a fully open-source, privacy-first alternative to proprietary solutions like Microsoft's Windows Recall. With OpenRecall, you can easily access your digital history, enhancing your memor...