Trending repositories for topic ocr
Python3 package for Chinese/English OCR, with paddleocr-v4 onnx model(~14MB). 基于ppocr-v4-onnx模型推理,可实现 CPU 上毫秒级的 OCR 精准预测,通用场景中英文OCR达到开源SOTA。
A high-quality tool for convert PDF to Markdown and JSON.一站式开源高质量数据提取工具,将PDF转换成Markdown和JSON格式。
A community-supported supercharged version of paperless: scan, index and archive all your physical documents
Awesome multilingual OCR toolkits based on PaddlePaddle (practical ultra lightweight OCR system, support 80+ languages recognition, provide data annotation and synthesis tools, support training and de...
A privacy-first, self-hosted, fully open source personal knowledge management software, written in typescript and golang.
OCR software, free and offline. 开源、免费的离线OCR软件。支持截屏/批量导入图片,PDF文档识别,排除水印/页眉页脚,扫描/生成二维码。内置多国语言库。
Open source libraries and APIs to build custom preprocessing pipelines for labeling, training, or production machine learning pipelines.
Ready-to-use OCR with 80+ supported languages and all popular writing scripts including Latin, Chinese, Arabic, Devanagari, Cyrillic and etc.
🌈一个跨平台的划词翻译和OCR软件 | A cross-platform software for text translation and recognition.
docTR (Document Text Recognition) - a seamless, high-performing & accessible library for OCR-related tasks powered by Deep Learning.
PyMuPDF is a high performance Python library for data extraction, analysis, conversion & manipulation of PDF (and other) documents.
ShareX is a free and open source program that lets you capture or record any area of your screen and share it with a single press of a key. It also allows uploading images, text or other types of file...
Python3 package for Chinese/English OCR, with paddleocr-v4 onnx model(~14MB). 基于ppocr-v4-onnx模型推理,可实现 CPU 上毫秒级的 OCR 精准预测,通用场景中英文OCR达到开源SOTA。
Translate manga/image 一键翻译各类图片内文字 https://cotrans.touhou.ai/
OCRmyPDF adds an OCR text layer to scanned PDF files, allowing them to be searched
Ingest, parse, and optimize any data format ➡️ from documents to multimedia ➡️ for enhanced compatibility with GenAI frameworks
pix2tex: Using a ViT to convert images of equations into LaTeX code.
Python3 package for Chinese/English OCR, with paddleocr-v4 onnx model(~14MB). 基于ppocr-v4-onnx模型推理,可实现 CPU 上毫秒级的 OCR 精准预测,通用场景中英文OCR达到开源SOTA。
OpenOCR: A general OCR system with accuracy and efficiency. Supporting 24 Scene Text Recognition methods trained from scratch on large-scale real datasets, and will continue to add the latest methods.
Fast and efficient unstructured data extraction. Written in Rust with bindings for many languages.
整理目前开源的最优表格识别模型,完善前后处理,模型转换为ONNX Organize the currently open-source optimal table recognition models, improve pre-processing and post-processing, and convert the models to ONNX.
ExtractThinker is a Document Intelligence library for LLMs, offering ORM-style interaction for flexible and powerful document workflows.
A FLOSS software for Persian Optical Character Recognition
LLM Agent Framework in ComfyUI includes Omost,GPT-sovits, ChatTTS,GOT-OCR2.0, and FLUX prompt nodes,access to Feishu,discord,and adapts to all llms with similar openai / aisuite interfaces, such as o1...
docTR (Document Text Recognition) - a seamless, high-performing & accessible library for OCR-related tasks powered by Deep Learning.
A high-quality tool for convert PDF to Markdown and JSON.一站式开源高质量数据提取工具,将PDF转换成Markdown和JSON格式。
A toolbox for deep learning model deployment using C++ YoloX | YoloV7 | YoloV8 | Gan | OCR | MobileVit | Scrfd | MobileSAM | StableDiffusion
ComfyUI-IF_AI_tools is a set of custom nodes for ComfyUI that allows you to generate prompts using a local Large Language Model (LLM) via Ollama. This tool enables you to enhance your image generation...
Yomitoku is an AI-powered document image analysis package designed specifically for the Japanese language.
Document (PDF) extraction and parse API using state of the art modern OCRs + Ollama supported models. Anonymize documents. Remove PII. Convert any document or picture to structured JSON or Markdown
MixTeX multimodal LaTeX, ZhEn, and, Table OCR. It performs efficient CPU-based inference in a local offline on Windows.
PyMuPDF is a high performance Python library for data extraction, analysis, conversion & manipulation of PDF (and other) documents.
Python3 package for Chinese/English OCR, with paddleocr-v4 onnx model(~14MB). 基于ppocr-v4-onnx模型推理,可实现 CPU 上毫秒级的 OCR 精准预测,通用场景中英文OCR达到开源SOTA。
A high-quality tool for convert PDF to Markdown and JSON.一站式开源高质量数据提取工具,将PDF转换成Markdown和JSON格式。
A community-supported supercharged version of paperless: scan, index and archive all your physical documents
Awesome multilingual OCR toolkits based on PaddlePaddle (practical ultra lightweight OCR system, support 80+ languages recognition, provide data annotation and synthesis tools, support training and de...
OCR software, free and offline. 开源、免费的离线OCR软件。支持截屏/批量导入图片,PDF文档识别,排除水印/页眉页脚,扫描/生成二维码。内置多国语言库。
A privacy-first, self-hosted, fully open source personal knowledge management software, written in typescript and golang.
PyMuPDF is a high performance Python library for data extraction, analysis, conversion & manipulation of PDF (and other) documents.
ShareX is a free and open source program that lets you capture or record any area of your screen and share it with a single press of a key. It also allows uploading images, text or other types of file...
pix2tex: Using a ViT to convert images of equations into LaTeX code.
Ready-to-use OCR with 80+ supported languages and all popular writing scripts including Latin, Chinese, Arabic, Devanagari, Cyrillic and etc.
Open source libraries and APIs to build custom preprocessing pipelines for labeling, training, or production machine learning pipelines.
Ingest, parse, and optimize any data format ➡️ from documents to multimedia ➡️ for enhanced compatibility with GenAI frameworks
🌈一个跨平台的划词翻译和OCR软件 | A cross-platform software for text translation and recognition.
Translate manga/image 一键翻译各类图片内文字 https://cotrans.touhou.ai/
OCRmyPDF adds an OCR text layer to scanned PDF files, allowing them to be searched
docTR (Document Text Recognition) - a seamless, high-performing & accessible library for OCR-related tasks powered by Deep Learning.
一个简洁优雅的词典翻译 macOS App。开箱即用,支持离线 OCR 识别,支持有道词典,🍎 苹果系统词典,🍎 苹果系统翻译,OpenAI,Gemini,DeepL,Google,Bing,腾讯,百度,阿里,小牛,彩云和火山翻译。A concise and elegant Dictionary and Translator macOS App for looking up words and...
Python3 package for Chinese/English OCR, with paddleocr-v4 onnx model(~14MB). 基于ppocr-v4-onnx模型推理,可实现 CPU 上毫秒级的 OCR 精准预测,通用场景中英文OCR达到开源SOTA。
OpenOCR: A general OCR system with accuracy and efficiency. Supporting 24 Scene Text Recognition methods trained from scratch on large-scale real datasets, and will continue to add the latest methods.
整理目前开源的最优表格识别模型,完善前后处理,模型转换为ONNX Organize the currently open-source optimal table recognition models, improve pre-processing and post-processing, and convert the models to ONNX.
Fast and efficient unstructured data extraction. Written in Rust with bindings for many languages.
A command-line application to convert images, PDFs, and audio files to text using Apple's APIs
A toolbox for deep learning model deployment using C++ YoloX | YoloV7 | YoloV8 | Gan | OCR | MobileVit | Scrfd | MobileSAM | StableDiffusion
ExtractThinker is a Document Intelligence library for LLMs, offering ORM-style interaction for flexible and powerful document workflows.
LLM Agent Framework in ComfyUI includes Omost,GPT-sovits, ChatTTS,GOT-OCR2.0, and FLUX prompt nodes,access to Feishu,discord,and adapts to all llms with similar openai / aisuite interfaces, such as o1...
Yomitoku is an AI-powered document image analysis package designed specifically for the Japanese language.
Table detection (TD) and table structure recognition (TSR) using Yolov5/Yolov8, cand you can get the same (even better) result compared with Table Transformer (TATR) with smaller models.
A high-quality tool for convert PDF to Markdown and JSON.一站式开源高质量数据提取工具,将PDF转换成Markdown和JSON格式。
MixTeX multimodal LaTeX, ZhEn, and, Table OCR. It performs efficient CPU-based inference in a local offline on Windows.
OCR Tamil is a powerful tool that can detect and recognize text in Tamil images with high accuracy on Natural Scenes
Python3 package for Chinese/English OCR, with paddleocr-v4 onnx model(~14MB). 基于ppocr-v4-onnx模型推理,可实现 CPU 上毫秒级的 OCR 精准预测,通用场景中英文OCR达到开源SOTA。
A powerful command-line OCR tool built with Apple's Vision framework, supporting single image and batch processing with detailed positional information output.
A high-quality tool for convert PDF to Markdown and JSON.一站式开源高质量数据提取工具,将PDF转换成Markdown和JSON格式。
A community-supported supercharged version of paperless: scan, index and archive all your physical documents
Awesome multilingual OCR toolkits based on PaddlePaddle (practical ultra lightweight OCR system, support 80+ languages recognition, provide data annotation and synthesis tools, support training and de...
OCR software, free and offline. 开源、免费的离线OCR软件。支持截屏/批量导入图片,PDF文档识别,排除水印/页眉页脚,扫描/生成二维码。内置多国语言库。
A privacy-first, self-hosted, fully open source personal knowledge management software, written in typescript and golang.
Fast and efficient unstructured data extraction. Written in Rust with bindings for many languages.
Yomitoku is an AI-powered document image analysis package designed specifically for the Japanese language.
ShareX is a free and open source program that lets you capture or record any area of your screen and share it with a single press of a key. It also allows uploading images, text or other types of file...
pix2tex: Using a ViT to convert images of equations into LaTeX code.
Ready-to-use OCR with 80+ supported languages and all popular writing scripts including Latin, Chinese, Arabic, Devanagari, Cyrillic and etc.
Open source libraries and APIs to build custom preprocessing pipelines for labeling, training, or production machine learning pipelines.
🌈一个跨平台的划词翻译和OCR软件 | A cross-platform software for text translation and recognition.
PyMuPDF is a high performance Python library for data extraction, analysis, conversion & manipulation of PDF (and other) documents.
一个简洁优雅的词典翻译 macOS App。开箱即用,支持离线 OCR 识别,支持有道词典,🍎 苹果系统词典,🍎 苹果系统翻译,OpenAI,Gemini,DeepL,Google,Bing,腾讯,百度,阿里,小牛,彩云和火山翻译。A concise and elegant Dictionary and Translator macOS App for looking up words and...
Document (PDF) extraction and parse API using state of the art modern OCRs + Ollama supported models. Anonymize documents. Remove PII. Convert any document or picture to structured JSON or Markdown
OCRmyPDF adds an OCR text layer to scanned PDF files, allowing them to be searched
Python3 package for Chinese/English OCR, with paddleocr-v4 onnx model(~14MB). 基于ppocr-v4-onnx模型推理,可实现 CPU 上毫秒级的 OCR 精准预测,通用场景中英文OCR达到开源SOTA。
Yomitoku is an AI-powered document image analysis package designed specifically for the Japanese language.
A powerful command-line OCR tool built with Apple's Vision framework, supporting single image and batch processing with detailed positional information output.
Fast and efficient unstructured data extraction. Written in Rust with bindings for many languages.
OpenOCR: A general OCR system with accuracy and efficiency. Supporting 24 Scene Text Recognition methods trained from scratch on large-scale real datasets, and will continue to add the latest methods.
整理目前开源的最优表格识别模型,完善前后处理,模型转换为ONNX Organize the currently open-source optimal table recognition models, improve pre-processing and post-processing, and convert the models to ONNX.
ExtractThinker is a Document Intelligence library for LLMs, offering ORM-style interaction for flexible and powerful document workflows.
A high-quality tool for convert PDF to Markdown and JSON.一站式开源高质量数据提取工具,将PDF转换成Markdown和JSON格式。
🤖 This repository houses a collection of image classification models for various purposes, including vehicle, object, animal, and flower classification. Each classifier is built using deep learning t...
Document (PDF) extraction and parse API using state of the art modern OCRs + Ollama supported models. Anonymize documents. Remove PII. Convert any document or picture to structured JSON or Markdown
OnnxTR a docTR (Document Text Recognition) library Onnx pipeline wrapper - for seamless, high-performing & accessible OCR
Efficient OCR engine for receipt image processing using Python, FastAPI, and Tesseract
SubtitleOCR(望言OCR) is a fast tool for hardcode video subtitle extraction.
📦 Out-Of-The-Box & Powerful File Parsing Service, support Text/Pdf/Docx/Pptx/Xlsx/Image/Audio parsing, support OCR, support Base64/Local/S3/R2/TG/MinIO storage.
Web interface for recognizing text, proofreading OCR, and creating fully-digitized documents.
A high-quality tool for convert PDF to Markdown and JSON.一站式开源高质量数据提取工具,将PDF转换成Markdown和JSON格式。
Ingest, parse, and optimize any data format ➡️ from documents to multimedia ➡️ for enhanced compatibility with GenAI frameworks
OpenRecall is a fully open-source, privacy-first alternative to proprietary solutions like Microsoft's Windows Recall. With OpenRecall, you can easily access your digital history, enhancing your memor...
Document (PDF) extraction and parse API using state of the art modern OCRs + Ollama supported models. Anonymize documents. Remove PII. Convert any document or picture to structured JSON or Markdown
Desktop app for automatically translating comics - BDs, Manga, Manhwa, Fumetti and more in a variety of formats (Image, Pdf, Epub, cbr, cbz, etc) and in multiple languages.
LLM Agent Framework in ComfyUI includes Omost,GPT-sovits, ChatTTS,GOT-OCR2.0, and FLUX prompt nodes,access to Feishu,discord,and adapts to all llms with similar openai / aisuite interfaces, such as o1...
MixTeX multimodal LaTeX, ZhEn, and, Table OCR. It performs efficient CPU-based inference in a local offline on Windows.
Fast and efficient unstructured data extraction. Written in Rust with bindings for many languages.
ComfyUI-IF_AI_tools is a set of custom nodes for ComfyUI that allows you to generate prompts using a local Large Language Model (LLM) via Ollama. This tool enables you to enhance your image generation...
ExtractThinker is a Document Intelligence library for LLMs, offering ORM-style interaction for flexible and powerful document workflows.
Yomitoku is an AI-powered document image analysis package designed specifically for the Japanese language.
OpenOCR: A general OCR system with accuracy and efficiency. Supporting 24 Scene Text Recognition methods trained from scratch on large-scale real datasets, and will continue to add the latest methods.
ID Card Recognition SDK which can recognize ID cards, Passports and Drive License from 200+ countries
A python wrapper for the Doc2X API and comes with native texts processing (to improve PDF recall in RAG). | Doc2X API的python封装,同时附带本地的文本处理(提升PDF在RAG中的召回率)。
an OCR tool to translate Old Persian cuneiform (Achaemenid language) by AI
A high-quality tool for convert PDF to Markdown and JSON.一站式开源高质量数据提取工具,将PDF转换成Markdown和JSON格式。
OCR software, free and offline. 开源、免费的离线OCR软件。支持截屏/批量导入图片,PDF文档识别,排除水印/页眉页脚,扫描/生成二维码。内置多国语言库。
A community-supported supercharged version of paperless: scan, index and archive all your physical documents
A privacy-first, self-hosted, fully open source personal knowledge management software, written in typescript and golang.
Awesome multilingual OCR toolkits based on PaddlePaddle (practical ultra lightweight OCR system, support 80+ languages recognition, provide data annotation and synthesis tools, support training and de...
Tesseract Open Source OCR Engine (main repository)
BISHENG is an open LLM devops platform for next generation Enterprise AI applications. Powerful and comprehensive features include: GenAI workflow, RAG, Agent, Unified model management, Evaluation, SF...
Ingest, parse, and optimize any data format ➡️ from documents to multimedia ➡️ for enhanced compatibility with GenAI frameworks
Open source libraries and APIs to build custom preprocessing pipelines for labeling, training, or production machine learning pipelines.
Ready-to-use OCR with 80+ supported languages and all popular writing scripts including Latin, Chinese, Arabic, Devanagari, Cyrillic and etc.
pix2tex: Using a ViT to convert images of equations into LaTeX code.
🌈一个跨平台的划词翻译和OCR软件 | A cross-platform software for text translation and recognition.
ShareX is a free and open source program that lets you capture or record any area of your screen and share it with a single press of a key. It also allows uploading images, text or other types of file...
OCRmyPDF adds an OCR text layer to scanned PDF files, allowing them to be searched
一个简洁优雅的词典翻译 macOS App。开箱即用,支持离线 OCR 识别,支持有道词典,🍎 苹果系统词典,🍎 苹果系统翻译,OpenAI,Gemini,DeepL,Google,Bing,腾讯,百度,阿里,小牛,彩云和火山翻译。A concise and elegant Dictionary and Translator macOS App for looking up words and...
PyMuPDF is a high performance Python library for data extraction, analysis, conversion & manipulation of PDF (and other) documents.
截屏 离线OCR 搜索翻译 以图搜图 贴图 录屏 万向滚动截屏 屏幕翻译 Screenshot Offline OCR Search Translate Search for picture Paste the picture on the screen Screen recorder Omnidirectional scrolling screenshot Sc...
Document (PDF) extraction and parse API using state of the art modern OCRs + Ollama supported models. Anonymize documents. Remove PII. Convert any document or picture to structured JSON or Markdown
Desktop app for automatically translating comics - BDs, Manga, Manhwa, Fumetti and more in a variety of formats (Image, Pdf, Epub, cbr, cbz, etc) and in multiple languages.
A ready-to-use, ready-to-go translation ocr tool developed by WPF/WPF 开发的一款即开即用、即用即走的翻译、OCR工具
OpenRecall is a fully open-source, privacy-first alternative to proprietary solutions like Microsoft's Windows Recall. With OpenRecall, you can easily access your digital history, enhancing your memor...
AI-powered organization and chat assistant for Obsidian
A package for parsing PDFs and analyzing their content using LLMs.
整理目前开源的最优表格识别模型,完善前后处理,模型转换为ONNX Organize the currently open-source optimal table recognition models, improve pre-processing and post-processing, and convert the models to ONNX.
Easy-to-Use Apple Vision wrapper for text extraction, scalar representation and clustering using K-means.
Enhance Tesseract OCR output for scanned PDFs by applying Large Language Model (LLM) corrections.
Web interface for recognizing text, proofreading OCR, and creating fully-digitized documents.
OCR Tamil is a powerful tool that can detect and recognize text in Tamil images with high accuracy on Natural Scenes
JavaVision是一个基于Java开发的全能视觉智能识别项目。该项目起源于对图像处理和人工智能领域的热情,以及对Java作为主要编程语言的坚持。在AI领域,大多数解决方案都是使用Python实现的,因此决定充分利用Java的优势来构建一个功能强大且易于集成的视觉智能识别平台。
Rust library and CLI tool for OCR (extracting text from images)
Parse vision is an open source tool to visualise what OCR is parsing in a PDF document to help developers and product teams identify if the parsing has missed some vital information from the document.
Yomitoku is an AI-powered document image analysis package designed specifically for the Japanese language.
(Pattern Recognition) Pytorch implementation of “HTR-VT: Handwritten Text Recognition with Vision Transformer”