Statistics for topic ocr
RepositoryStats tracks 608,294 GitHub repositories; of these, 619 are tagged with the ocr topic. The most common primary language for repositories using this topic is Python (298). Other languages include: C++ (43), Jupyter Notebook (42), TypeScript (32), Java (28), JavaScript (27), C# (26), HTML (11), Swift (11).
Stargazers over time for topic ocr
Most starred repositories for topic ocr
Trending repositories for topic ocr
Privacy-first, self-hosted, fully open-source personal knowledge management software, written in TypeScript and Go.
A community-supported supercharged version of paperless: scan, index and archive all your physical documents
A one-stop, open-source, high-quality data extraction tool that converts PDF to Markdown and JSON.
A lightning-fast tool for extracting hardcoded subtitles from video. Reaches 10x extraction speed with just an Apple M1 chip or an NVIDIA 3060 graphics card.
A Rust library integrated with ONNXRuntime, providing a collection of Computer Vision and Vision-Language models.
MRZ Passport Reader from Image is a Python-based tool that automatically detects, segments, and extracts text from the Machine-Readable Zone (MRZ) of passport images. Utilizing deep learning models fo...
Desktop app for automatically translating comics (BDs, Manga, Manhwa, Fumetti and more) in a variety of formats (image, PDF, EPUB, CBR, CBZ, etc.) and in multiple languages.
ExtractThinker is a Document Intelligence library for LLMs, offering ORM-style interaction for flexible and powerful document workflows.
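Free, open-source, offline OCR software. Supports screenshot capture and batch image import, PDF document recognition, exclusion of watermarks and headers/footers, and scanning and generating QR codes. Includes built-in libraries for multiple languages.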
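ocr-docker is a small, Flask-powered web app that extracts text from images and PDF documents using OCR.
Intelligent OCR recognition of ID cards, document extraction, and automated CAPTCHA parsing; the core of the project is built on deep learning. Models, datasets, fine-tuning, and API support are provided and free to use; follow along for more models from the author. V: chenganp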
Ingest, parse, and optimize any data format ➡️ from documents to multimedia ➡️ for enhanced compatibility with GenAI frameworks
Document (PDF, Word, PPTX ...) extraction and parsing API using state-of-the-art OCRs and Ollama-supported models. Anonymize documents. Remove PII. Convert any document or picture to structured JSO...
OpenRecall is a fully open-source, privacy-first alternative to proprietary solutions like Microsoft's Windows Recall. With OpenRecall, you can easily access your digital history, enhancing your memor...
Awesome multilingual OCR toolkits based on PaddlePaddle (a practical, ultra-lightweight OCR system that supports recognition of 80+ languages, provides data annotation and synthesis tools, and supports training and de...
A package for parsing PDFs and analyzing their content using LLMs.