Trending repositories for topic pdf
Python tool for converting files and office documents to Markdown.
PDF scientific paper translation with preserved formats - AI-based full-text bilingual translation of PDF documents with layout fully preserved; supports Google/DeepL/Ollama/OpenAI and other services; provides CLI/GUI/Docker
A high-quality tool for converting PDF to Markdown and JSON. A one-stop, open-source, high-quality data extraction tool that converts PDF to Markdown and JSON.
A community-supported supercharged version of paperless: scan, index and archive all your physical documents
#1 Locally hosted web application that allows you to perform various operations on PDF files
File Parser optimised for LLM Ingestion with no loss 🧠 Parse PDFs, Docx, PPTx in a format that is ideal for LLMs.
A modern ebook manager and reader with sync and backup capabilities for Windows, macOS, Linux and Web
A privacy-first, self-hosted, fully open source personal knowledge management software, written in TypeScript and Go.
🥤 RAGLite is a Python toolkit for Retrieval-Augmented Generation (RAG) with PostgreSQL or SQLite
Open source libraries and APIs to build custom preprocessing pipelines for labeling, training, or production machine learning pipelines.
An ebook reader application supporting PDF, DjVu, EPUB, FB2 and many more formats, running on Cervantes, Kindle, Kobo, PocketBook and Android devices
PyMuPDF is a high performance Python library for data extraction, analysis, conversion & manipulation of PDF (and other) documents.
Open source DocuSign alternative. Create, fill, and sign digital documents ✍️
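Free downloads of English-language magazines including The Economist (with audio), The New Yorker, The Guardian, Wired, and The Atlantic, in epub, mobi, and pdf formats; updated weekly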
QuestPDF is a modern open-source .NET library for PDF document generation. Offering comprehensive layout engine powered by concise and discoverable C# Fluent API. Easily generate PDF reports, invoices...
Zotero chat PDF with GPT, ChatGPT, Claude, Gemini
Selfhosted PDF manager and viewer offering a seamless user experience on multiple devices.
A macOS native comic reader and image viewer [CBR CBZ RAR ZIP PDF] now updated and built for Intel and Apple Silicon Macs running 10.14 (Intel) or 11.4 (Apple Silicon) or later.
Fast and efficient unstructured data extraction. Written in Rust with bindings for many languages.
Embed malware, APKs, executables or any other binary file into a PDF, or generate a PDF with an embedded malicious link.
ExtractThinker is a Document Intelligence library for LLMs, offering ORM-style interaction for flexible and powerful document workflows.
Specify a GitHub or local repo, GitHub pull request, arXiv or Sci-Hub paper, YouTube transcript or documentation URL on the web and scrape it into a text file and the clipboard for easier LLM ingestion
📖 The most advanced (yet simple) cli manga downloader in the entire universe! Lua scrapers, export formats, anilist integration, fancy TUI and more!
HTMLToQPDF is an extension for QuestPDF that generates PDFs from HTML
Translate PDF, EPub, webpage, metadata, annotations, notes to the target language. Support 20+ translate services.
Learn how to create HTML/ZIP/PNG polyglot files in JavaScript
Base framework offering a Lua scriptable environment for creating document readers
🗜️ [CAD conversion command-line tool] Converts DXF and DWG to DXF, PDF, PNG, and SVG. [cad2x] Minimal stand-alone CLI tool to convert CAD files (DXF / DWG) to other formats (DXF / PDF / PNG / SVG)
Examples for the x-viewer SDK, which is a WebGL-based BIM model viewer built on Three.js and Vue3. It is used to view DWG/DXF/PDF files.
Completely local RAG. Chat with your PDF documents (with an open LLM) and a UI that uses LangChain, Streamlit, Ollama (Llama 3.1), Qdrant and advanced methods like reranking and semantic chunking.
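A collection of free programming book resources for programmers, with occasional sharing of programming books, technical articles, learning resources, useful software, and internet-related technology. Provided for everyone's study and reference, and also for my own improvement. Watch and Star are welcome.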
📚 Library of essential programming books. (Check out my new project `SendScriptWhatsapp`)
Kavita is a fast, feature rich, cross platform reading server. Built with the goal of being a full solution for all your reading needs. Set up your own server and share your reading collection with you...
A Symfony Bundle for interacting with Gotenberg. Integrates natively with twig, router, PHPStorm and more!
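Convert videos into illustrated PDFs with text and images (videotopdf): a must-have for office workers (meeting notes), students (online course notes), and others! Try it at: https://zjrwtxtechstudio-video-to-pdf.hf.space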
A PHP library for converting HTML to PDF using Google Chrome
Build and generate PDF using React 📄 UI kit for PDFs and print documents. Simple, reusable components and templates to create great invoices, docs, brochures. Use your favorite front-end framework Re...
Document (PDF) extraction and parse API using state of the art modern OCRs + Ollama supported models. Anonymize documents. Remove PII. Convert any document or picture to structured JSON or Markdown
GUI analyzer for deep-diving into PDF files. Detect malicious payloads, understand object relationships, and extract key information for threat analysis.
Material3 eBook reader - Book's Story. Built with Jetpack Compose. Free & Open Source & Ad Free. Lots of customization and supported file formats.
A Python wrapper for the Doc2X API that comes with local text processing (to improve PDF recall in RAG).
Papermark is the open-source DocSend alternative with built-in analytics and custom domains.
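🔥🔥 More than 1,000 classic computer science books, personal notes, and resources referenced in the author's articles published on various platforms. Book topics include C/C++, Java, Python, Go, data structures and algorithms, operating systems, backend architecture, computer systems, databases, computer networks, design patterns, front end, assembly, and interview write-ups for campus and experienced-hire recruiting.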
OCRmyPDF adds an OCR text layer to scanned PDF files, allowing them to be searched
PDF++: The most Obsidian-native PDF annotation & viewing tool ever. Comes with optional Vim keybindings.
View and Interact with PDFs in React, SolidJS, Svelte and JavaScript apps
A bot that accepts PDF docs and lets you ask questions about them.
SemanticPDF: Drag, Drop, Semantic Search - SemanticPDF is a simple, privacy-focused application that makes it easy to upload a PDF file and perform a semantic search on contents.
pdf-frame is a web framework designed specifically for handling PDF and Canvas graphics requirements. It provides component support for popular frameworks like Vue, Nuxt and React. With its declarativ...
Viewer for the structure extracted by Grobid on PDF documents