Trending repositories for topic pdf

Last 3 days (new repositories)

no newly created repositories trending in the last 3 days

Last 3 days (absolute gain)

microsoft/markitdown

Python tool for converting files and office documents to Markdown.

25,619 (+7,011)

mit

Byaidu/PDFMathTranslate

PDF scientific paper translation with preserved formats - AI-based full-text bilingual translation of PDF documents with layout fully preserved; supports Google/DeepL/Ollama/OpenAI and other services; provides CLI/GUI/Docker

10,057 (+966)

agpl-3.0

DS4SD/docling

Get your documents ready for gen AI

15,959 (+594)

mit

opendatalab/MinerU

A high-quality tool for converting PDF to Markdown and JSON. A one-stop, open-source, high-quality data extraction tool that converts PDF to Markdown and JSON formats.

21,669 (+218)

agpl-3.0

justjavac/free-programming-books-zh_CN

📚 Free Chinese-language computer programming books; contributions welcome

112,172 (+120)

gpl-3.0

paperless-ngx/paperless-ngx

A community-supported supercharged version of paperless: scan, index and archive all your physical documents

23,117 (+99)

gpl-3.0

Stirling-Tools/Stirling-PDF

#1 Locally hosted web application that allows you to perform various operations on PDF files

47,520 (+89)

mit

QuivrHQ/MegaParse

File Parser optimised for LLM Ingestion with no loss 🧠 Parse PDFs, Docx, PPTx in a format that is ideal for LLMs.

4,594 (+88)

apache-2.0

getomni-ai/zerox

PDF to Markdown with vision models

7,086 (+83)

mit

koodo-reader/koodo-reader

A modern ebook manager and reader with sync and backup capabilities for Windows, macOS, Linux and Web

19,979 (+76)

agpl-3.0

siyuan-note/siyuan

A privacy-first, self-hosted, fully open source personal knowledge management software, written in TypeScript and Go.

23,607 (+69)

agpl-3.0

papersgpt/papersgpt-for-zotero

Zotero chat PDF with GPT, ChatGPT, Claude, Gemini

614 (+64)

agpl-3.0

superlinear-ai/raglite

🥤 RAGLite is a Python toolkit for Retrieval-Augmented Generation (RAG) with PostgreSQL or SQLite

604 (+63)

mpl-2.0

Unstructured-IO/unstructured

Open source libraries and APIs to build custom preprocessing pipelines for labeling, training, or production machine learning pipelines.

9,511 (+47)

apache-2.0

koreader/koreader

An ebook reader application supporting PDF, DjVu, EPUB, FB2 and many more formats, running on Cervantes, Kindle, Kobo, PocketBook and Android devices

17,386 (+46)

agpl-3.0

0voice/expert_readed_books

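Latest 2021 roundup of recommended reading for engineers: computer science, software technology, entrepreneurship, ideas, mathematics, and biographies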

9,113 (+43)

pymupdf/PyMuPDF

PyMuPDF is a high performance Python library for data extraction, analysis, conversion & manipulation of PDF (and other) documents.

5,994 (+42)

agpl-3.0

docusealco/docuseal

Open source DocuSign alternative. Create, fill, and sign digital documents ✍️

8,188 (+37)

agpl-3.0

hehonghui/awesome-english-ebooks

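Free downloads of English-language magazines such as The Economist (with audio), The New Yorker, The Guardian, Wired, and The Atlantic, in epub, mobi, and pdf formats; updated weekly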

22,474 (+37)

QuestPDF/QuestPDF

QuestPDF is a modern open-source .NET library for PDF document generation. Offering comprehensive layout engine powered by concise and discoverable C# Fluent API. Easily generate PDF reports, invoices...

12,230 (+30)

Last 3 days (relative gain)

microsoft/markitdown

Python tool for converting files and office documents to Markdown.

25,619 (+38%)

mit

addyosmani/scan

Free document to PDF scanner

31 (+15%)

superlinear-ai/raglite

🥤 RAGLite is a Python toolkit for Retrieval-Augmented Generation (RAG) with PostgreSQL or SQLite

604 (+12%)

mpl-2.0

papersgpt/papersgpt-for-zotero

Zotero chat PDF with GPT, ChatGPT, Claude, Gemini

614 (+12%)

agpl-3.0

Byaidu/PDFMathTranslate

PDF scientific paper translation with preserved formats - AI-based full-text bilingual translation of PDF documents with layout fully preserved; supports Google/DeepL/Ollama/OpenAI and other services; provides CLI/GUI/Docker

10,057 (+11%)

agpl-3.0

mrmn2/PdfDing

Self-hosted PDF manager and viewer offering a seamless user experience on multiple devices.

91 (+10%)

gpl-3.0

DS4SD/docling

Get your documents ready for gen AI

15,959 (+4%)

mit

explosion/spacy-layout

📚 Process PDFs, Word documents and more with spaCy

255 (+4%)

mit

chuchusoft/Sequential

A macOS native comic reader and image viewer [CBR CBZ RAR ZIP PDF] now updated and built for Intel and Apple Silicon Macs running 10.14 (Intel) or 11.4 (Apple Silicon) or later.

37 (+3%)

yobix-ai/extractous

Fast and efficient unstructured data extraction. Written in Rust with bindings for many languages.

620 (+3%)

apache-2.0

e-m3din4/booby-trap-pdf

Embed malware, APKs, executables or any other binary file into a PDF, or generate a PDF with a malicious link embedded.

42 (+2%)

mit

drudge/n8n-nodes-puppeteer

n8n node for browser automation using Puppeteer

103 (+2%)

mit

QuivrHQ/MegaParse

File Parser optimised for LLM Ingestion with no loss 🧠 Parse PDFs, Docx, PPTx in a format that is ideal for LLMs.

4,594 (+2%)

apache-2.0

chinapandaman/PyPDFForm

🔥 The Python library for PDF forms.

475 (+1%)

mit

enoch3712/ExtractThinker

ExtractThinker is a Document Intelligence library for LLMs, offering ORM-style interaction for flexible and powerful document workflows.

477 (+1%)

apache-2.0

jimmc414/1filellm

Specify a GitHub or local repo, GitHub pull request, arXiv or Sci-Hub paper, YouTube transcript or documentation URL on the web and scrape into a text file and clipboard for easier LLM ingestion

641 (+1%)

mit

metafates/mangal

📖 The most advanced (yet simple) cli manga downloader in the entire universe! Lua scrapers, export formats, anilist integration, fancy TUI and more!

1,478 (+1%)

mit

Relorer/HTMLToQPDF

HTMLToQPDF is an extension for QuestPDF that generates PDFs from HTML

92 (+1%)

mit

sungaila/PDFtoImage

A .NET library to render PDF files into images.

199 (+1%)

mit

cloudcommunity/Free-Books

Free eBooks about cloud computing and related topics

105 (+1%)

mit

Last week (new repositories)

no newly created repositories trending in the last week

Last week (absolute gain)

microsoft/markitdown

Python tool for converting files and office documents to Markdown.

25,619 (+23,028)

mit

Byaidu/PDFMathTranslate

PDF scientific paper translation with preserved formats - AI-based full-text bilingual translation of PDF documents with layout fully preserved; supports Google/DeepL/Ollama/OpenAI and other services; provides CLI/GUI/Docker

10,057 (+3,797)

agpl-3.0

DS4SD/docling

Get your documents ready for gen AI

15,959 (+1,772)

mit

opendatalab/MinerU

A high-quality tool for converting PDF to Markdown and JSON. A one-stop, open-source, high-quality data extraction tool that converts PDF to Markdown and JSON formats.

21,669 (+464)

agpl-3.0

QuivrHQ/MegaParse

File Parser optimised for LLM Ingestion with no loss 🧠 Parse PDFs, Docx, PPTx in a format that is ideal for LLMs.

4,594 (+393)

apache-2.0

Stirling-Tools/Stirling-PDF

#1 Locally hosted web application that allows you to perform various operations on PDF files

47,520 (+196)

mit

paperless-ngx/paperless-ngx

A community-supported supercharged version of paperless: scan, index and archive all your physical documents

23,117 (+183)

gpl-3.0

justjavac/free-programming-books-zh_CN

📚 Free Chinese-language computer programming books; contributions welcome

112,172 (+166)

gpl-3.0

koodo-reader/koodo-reader

A modern ebook manager and reader with sync and backup capabilities for Windows, macOS, Linux and Web

19,979 (+154)

agpl-3.0

getomni-ai/zerox

PDF to Markdown with vision models

7,086 (+148)

mit

siyuan-note/siyuan

A privacy-first, self-hosted, fully open source personal knowledge management software, written in TypeScript and Go.

23,607 (+111)

agpl-3.0

pymupdf/PyMuPDF

PyMuPDF is a high performance Python library for data extraction, analysis, conversion & manipulation of PDF (and other) documents.

5,994 (+107)

agpl-3.0

0voice/expert_readed_books

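Latest 2021 roundup of recommended reading for engineers: computer science, software technology, entrepreneurship, ideas, mathematics, and biographies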

9,113 (+103)

papersgpt/papersgpt-for-zotero

Zotero chat PDF with GPT, ChatGPT, Claude, Gemini

614 (+95)

agpl-3.0

superlinear-ai/raglite

🥤 RAGLite is a Python toolkit for Retrieval-Augmented Generation (RAG) with PostgreSQL or SQLite

604 (+81)

mpl-2.0

koreader/koreader

An ebook reader application supporting PDF, DjVu, EPUB, FB2 and many more formats, running on Cervantes, Kindle, Kobo, PocketBook and Android devices

17,386 (+80)

agpl-3.0

Unstructured-IO/unstructured

Open source libraries and APIs to build custom preprocessing pipelines for labeling, training, or production machine learning pipelines.

9,511 (+78)

apache-2.0

hehonghui/awesome-english-ebooks

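Free downloads of English-language magazines such as The Economist (with audio), The New Yorker, The Guardian, Wired, and The Atlantic, in epub, mobi, and pdf formats; updated weekly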

22,474 (+69)

docusealco/docuseal

Open source DocuSign alternative. Create, fill, and sign digital documents ✍️

8,188 (+62)

agpl-3.0

windingwind/zotero-pdf-translate

Translate PDF, EPub, webpage, metadata, annotations, notes to the target language. Support 20+ translate services.

7,801 (+54)

agpl-3.0

Last week (relative gain)

microsoft/markitdown

Python tool for converting files and office documents to Markdown.

25,619 (+889%)

mit

addyosmani/scan

Free document to PDF scanner

31 (+72%)

Byaidu/PDFMathTranslate

PDF scientific paper translation with preserved formats - AI-based full-text bilingual translation of PDF documents with layout fully preserved; supports Google/DeepL/Ollama/OpenAI and other services; provides CLI/GUI/Docker

10,057 (+61%)

agpl-3.0

mrmn2/PdfDing

Self-hosted PDF manager and viewer offering a seamless user experience on multiple devices.

91 (+21%)

gpl-3.0

papersgpt/papersgpt-for-zotero

Zotero chat PDF with GPT, ChatGPT, Claude, Gemini

614 (+18%)

agpl-3.0

superlinear-ai/raglite

🥤 RAGLite is a Python toolkit for Retrieval-Augmented Generation (RAG) with PostgreSQL or SQLite

604 (+15%)

mpl-2.0

gildas-lormeau/Polyglot-HTML-ZIP-PNG

Learn how to create HTML/ZIP/PNG polyglot files in JavaScript

36 (+13%)

mit

DS4SD/docling

Get your documents ready for gen AI

15,959 (+12%)

mit

koreader/koreader-base

Base framework offering a Lua scriptable environment for creating document readers

155 (+12%)

agpl-3.0

QuivrHQ/MegaParse

File Parser optimised for LLM Ingestion with no loss 🧠 Parse PDFs, Docx, PPTx in a format that is ideal for LLMs.

4,594 (+9%)

apache-2.0

explosion/spacy-layout

📚 Process PDFs, Word documents and more with spaCy

255 (+8%)

mit

Melanee-Melanee/OCR-on-PDF

OCR on large, unsearchable PDF files

56 (+6%)

spawnia/md-to-pdf

A web service for converting markdown to PDF

99 (+5%)

mit

yobix-ai/extractous

Fast and efficient unstructured data extraction. Written in Rust with bindings for many languages.

620 (+5%)

apache-2.0

orcastor/cad2x-converter

🗜️ [cad2x] Minimal stand-alone CLI tool to convert CAD files (DXF / DWG) to other formats (DXF / PDF / PNG / SVG)

79 (+4%)

lgpl-2.1

egghunters/dxf-viewer-examples

Examples for the x-viewer SDK, which is a WebGL-based BIM model viewer built on Three.js and Vue3. It is used to view DWG/DXF/PDF files.

27 (+4%)

chinapandaman/PyPDFForm

🔥 The Python library for PDF forms.

475 (+4%)

mit

shanedemorais/pinescript_v5_user_manual_pdfs

Pine Script V5 online manual as PDF files.

30 (+3%)

curiousily/ragbase

Completely local RAG. Chat with your PDF documents (with an open LLM) through a UI that uses LangChain, Streamlit, Ollama (Llama 3.1), Qdrant and advanced methods like reranking and semantic chunking.

65 (+3%)

mit

iGaoWei/CoderBooks

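A collection of free programming book resources for developers, with occasional sharing of programming books, technical articles, learning resources, useful software, and internet-related technologies. Intended for everyone's study and reference as well as the author's own improvement; Watch and Star are welcome.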

70 (+3%)

Last month (new repositories)

papersgpt/papersgpt-for-zotero

Zotero chat PDF with GPT, ChatGPT, Claude, Gemini

614

agpl-3.0

mrmn2/PdfDing

Self-hosted PDF manager and viewer offering a seamless user experience on multiple devices.

gpl-3.0

Melanee-Melanee/OCR-on-PDF

OCR on large, unsearchable PDF files

addyosmani/scan

Free document to PDF scanner

StabRise/spark-pdf

PDF DataSource for Apache Spark

agpl-3.0

Last month (absolute gain)

microsoft/markitdown

Python tool for converting files and office documents to Markdown.

25,619 (+25,617)

mit

Byaidu/PDFMathTranslate

PDF scientific paper translation with preserved formats - AI-based full-text bilingual translation of PDF documents with layout fully preserved; supports Google/DeepL/Ollama/OpenAI and other services; provides CLI/GUI/Docker

10,057 (+7,730)

agpl-3.0

DS4SD/docling

Get your documents ready for gen AI

15,959 (+5,748)

mit

QuivrHQ/MegaParse

File Parser optimised for LLM Ingestion with no loss 🧠 Parse PDFs, Docx, PPTx in a format that is ideal for LLMs.

4,594 (+3,909)

apache-2.0

opendatalab/MinerU

A high-quality tool for converting PDF to Markdown and JSON. A one-stop, open-source, high-quality data extraction tool that converts PDF to Markdown and JSON formats.

21,669 (+3,852)

agpl-3.0

koodo-reader/koodo-reader

A modern ebook manager and reader with sync and backup capabilities for Windows, macOS, Linux and Web

19,979 (+1,191)

agpl-3.0

0voice/expert_readed_books

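Latest 2021 roundup of recommended reading for engineers: computer science, software technology, entrepreneurship, ideas, mathematics, and biographies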

9,113 (+1,129)

Stirling-Tools/Stirling-PDF

#1 Locally hosted web application that allows you to perform various operations on PDF files

47,520 (+1,111)

mit

paperless-ngx/paperless-ngx

A community-supported supercharged version of paperless: scan, index and archive all your physical documents

23,117 (+1,035)

gpl-3.0

siyuan-note/siyuan

A privacy-first, self-hosted, fully open source personal knowledge management software, written in TypeScript and Go.

23,607 (+690)

agpl-3.0

papersgpt/papersgpt-for-zotero

Zotero chat PDF with GPT, ChatGPT, Claude, Gemini

614 (+613)

agpl-3.0

getomni-ai/zerox

PDF to Markdown with vision models

7,086 (+590)

mit

superlinear-ai/raglite

🥤 RAGLite is a Python toolkit for Retrieval-Augmented Generation (RAG) with PostgreSQL or SQLite

604 (+578)

mpl-2.0

koreader/koreader

An ebook reader application supporting PDF, DjVu, EPUB, FB2 and many more formats, running on Cervantes, Kindle, Kobo, PocketBook and Android devices

17,386 (+499)

agpl-3.0

yobix-ai/extractous

Fast and efficient unstructured data extraction. Written in Rust with bindings for many languages.

620 (+481)

apache-2.0

justjavac/free-programming-books-zh_CN

📚 Free Chinese-language computer programming books; contributions welcome

112,172 (+400)

gpl-3.0

KAYOKG/BibliotecaDev

📚 Library of essential programming books. (Check out my new project `SendScriptWhatsapp`)

6,161 (+387)

mit

Kareadita/Kavita

Kavita is a fast, feature rich, cross platform reading server. Built with the goal of being a full solution for all your reading needs. Setup your own server and share your reading collection with you...

6,759 (+365)

gpl-3.0

Unstructured-IO/unstructured

Open source libraries and APIs to build custom preprocessing pipelines for labeling, training, or production machine learning pipelines.

9,511 (+325)

apache-2.0

pymupdf/PyMuPDF

PyMuPDF is a high performance Python library for data extraction, analysis, conversion & manipulation of PDF (and other) documents.

5,994 (+311)

agpl-3.0

Last month (relative gain)

superlinear-ai/raglite

🥤 RAGLite is a Python toolkit for Retrieval-Augmented Generation (RAG) with PostgreSQL or SQLite

604 (+2,223%)

mpl-2.0

mrmn2/PdfDing

Self-hosted PDF manager and viewer offering a seamless user experience on multiple devices.

91 (+1,720%)

gpl-3.0

QuivrHQ/MegaParse

File Parser optimised for LLM Ingestion with no loss 🧠 Parse PDFs, Docx, PPTx in a format that is ideal for LLMs.

4,594 (+571%)

apache-2.0

Melanee-Melanee/OCR-on-PDF

OCR on large, unsearchable PDF files

56 (+367%)

yobix-ai/extractous

Fast and efficient unstructured data extraction. Written in Rust with bindings for many languages.

620 (+346%)

apache-2.0

Byaidu/PDFMathTranslate

PDF scientific paper translation with preserved formats - AI-based full-text bilingual translation of PDF documents with layout fully preserved; supports Google/DeepL/Ollama/OpenAI and other services; provides CLI/GUI/Docker

10,057 (+332%)

agpl-3.0

explosion/spacy-layout

📚 Process PDFs, Word documents and more with spaCy

255 (+174%)

mit

windingwind/bionic-for-zotero

Bionic reading experience with Zotero.

88 (+87%)

agpl-3.0

addyosmani/scan

Free document to PDF scanner

31 (+72%)

sensiolabs/GotenbergBundle

A Symfony Bundle for interacting with Gotenberg. Integrates natively with Twig, the router, PhpStorm and more!

62 (+63%)

mit

DS4SD/docling

Get your documents ready for gen AI

15,959 (+56%)

mit

gildas-lormeau/Polyglot-HTML-ZIP-PNG

Learn how to create HTML/ZIP/PNG polyglot files in JavaScript

36 (+38%)

mit

zjrwtx/videotopdf_ui

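videotopdf: convert videos into richly illustrated PDFs. A must-have for office workers (meeting minutes), students (online course notes), and others! Try it at: https://zjrwtxtechstudio-video-to-pdf.hf.space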

28 (+33%)

CycloneBoy/pdf_table

A Unified Toolkit for Deep Learning-Based Table Extraction

26 (+30%)

enoch3712/ExtractThinker

ExtractThinker is a Document Intelligence library for LLMs, offering ORM-style interaction for flexible and powerful document workflows.

477 (+27%)

apache-2.0

OnedocLabs/pdfreader

Easy Radix-Style PDF Viewer for React.

63 (+26%)

mit

opendatalab/MinerU

A high-quality tool for converting PDF to Markdown and JSON. A one-stop, open-source, high-quality data extraction tool that converts PDF to Markdown and JSON formats.

21,669 (+22%)

agpl-3.0

spiritix/php-chrome-html2pdf

A PHP library for converting HTML to PDF using Google Chrome

137 (+21%)

mit

drudge/n8n-nodes-puppeteer

n8n node for browser automation using Puppeteer

103 (+20%)

mit

HiIamChaitanya/pdf-flipbook

PDF Flipbook

80 (+19%)

mit

Last 12-months (new repositories)

microsoft/markitdown

Python tool for converting files and office documents to Markdown.

25,619

mit

opendatalab/MinerU

A high-quality tool for converting PDF to Markdown and JSON. A one-stop, open-source, high-quality data extraction tool that converts PDF to Markdown and JSON formats.

21,669

agpl-3.0

DS4SD/docling

Get your documents ready for gen AI

15,959

mit

Byaidu/PDFMathTranslate

PDF scientific paper translation with preserved formats - AI-based full-text bilingual translation of PDF documents with layout fully preserved; supports Google/DeepL/Ollama/OpenAI and other services; provides CLI/GUI/Docker

10,057

agpl-3.0

getomni-ai/zerox

PDF to Markdown with vision models

7,086

mit

QuivrHQ/MegaParse

File Parser optimised for LLM Ingestion with no loss 🧠 Parse PDFs, Docx, PPTx in a format that is ideal for LLMs.

4,594

apache-2.0

run-llama/llama_parse

Parse files for optimal RAG

3,402

mit

OnedocLabs/react-print-pdf

Build and generate PDF using React 📄 UI kit for PDFs and print documents. Simple, reusable components and templates to create great invoices, docs, brochures. Use your favorite front-end framework Re...

2,339

apache-2.0

CatchTheTornado/pdf-extract-api

Document (PDF) extraction and parse API using state of the art modern OCRs + Ollama supported models. Anonymize documents. Remove PII. Convert any document or picture to structured JSON or Markdown

1,540

gpl-3.0

emcf/thepipe

Extract clean data from anywhere, powered by vision-language models ⚡

1,196

mit

seekbytes/IPA

GUI analyzer for deep-diving into PDF files. Detect malicious payloads, understand object relationships, and extract key information for threat analysis.

807

gpl-2.0

spatie/laravel-pdf

Create PDF files in Laravel apps

738

mit

yobix-ai/extractous

Fast and efficient unstructured data extraction. Written in Rust with bindings for many languages.

620

apache-2.0

papersgpt/papersgpt-for-zotero

Zotero chat PDF with GPT, ChatGPT, Claude, Gemini

614

agpl-3.0

superlinear-ai/raglite

🥤 RAGLite is a Python toolkit for Retrieval-Augmented Generation (RAG) with PostgreSQL or SQLite

604

mpl-2.0

Acclorite/book-story

Material3 eBook reader - Book's Story. Built with Jetpack Compose. Free & Open Source & Ad Free. Lots of customization and supported file formats.

487

gpl-3.0

enoch3712/ExtractThinker

ExtractThinker is a Document Intelligence library for LLMs, offering ORM-style interaction for flexible and powerful document workflows.

477

apache-2.0

ArtifexSoftware/mupdf.js

JavaScript bindings for MuPDF

430

agpl-3.0

explosion/spacy-layout

📚 Process PDFs, Word documents and more with spaCy

255

mit

Menghuan1918/pdfdeal

A Python wrapper for the Doc2X API that comes with native text processing (to improve PDF recall in RAG).

206

mit

Last 12-months (absolute gain)

Stirling-Tools/Stirling-PDF

#1 Locally hosted web application that allows you to perform various operations on PDF files

47,520 (+42,344)

mit

microsoft/markitdown

Python tool for converting files and office documents to Markdown.

25,619 (+25,617)

mit

opendatalab/MinerU

A high-quality tool for converting PDF to Markdown and JSON. A one-stop, open-source, high-quality data extraction tool that converts PDF to Markdown and JSON formats.

21,669 (+21,668)

agpl-3.0

DS4SD/docling

Get your documents ready for gen AI

15,959 (+15,950)

mit

paperless-ngx/paperless-ngx

A community-supported supercharged version of paperless: scan, index and archive all your physical documents

23,117 (+10,061)

gpl-3.0

Byaidu/PDFMathTranslate

PDF scientific paper translation with preserved formats - AI-based full-text bilingual translation of PDF documents with layout fully preserved; supports Google/DeepL/Ollama/OpenAI and other services; provides CLI/GUI/Docker

10,057 (+10,056)

agpl-3.0

siyuan-note/siyuan

A privacy-first, self-hosted, fully open source personal knowledge management software, written in TypeScript and Go.

23,607 (+9,958)

agpl-3.0

hehonghui/awesome-english-ebooks

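Free downloads of English-language magazines such as The Economist (with audio), The New Yorker, The Guardian, Wired, and The Atlantic, in epub, mobi, and pdf formats; updated weekly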

22,474 (+7,668)

getomni-ai/zerox

PDF to Markdown with vision models

7,086 (+7,075)

mit

koodo-reader/koodo-reader

A modern ebook manager and reader with sync and backup capabilities for Windows, macOS, Linux and Web

19,979 (+6,186)

agpl-3.0

justjavac/free-programming-books-zh_CN

📚 Free Chinese-language computer programming books; contributions welcome

112,172 (+5,751)

gpl-3.0

Unstructured-IO/unstructured

Open source libraries and APIs to build custom preprocessing pipelines for labeling, training, or production machine learning pipelines.

9,511 (+5,751)

apache-2.0

KAYOKG/BibliotecaDev

📚 Library of essential programming books. (Check out my new project `SendScriptWhatsapp`)

6,161 (+5,722)

mit

QuivrHQ/MegaParse

File Parser optimised for LLM Ingestion with no loss 🧠 Parse PDFs, Docx, PPTx in a format that is ideal for LLMs.

4,594 (+4,582)

apache-2.0

0voice/expert_readed_books

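Latest 2021 roundup of recommended reading for engineers: computer science, software technology, entrepreneurship, ideas, mathematics, and biographies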

9,113 (+4,490)

mfts/papermark

Papermark is the open-source DocSend alternative with built-in analytics and custom domains.

5,813 (+4,392)

agpl-3.0

forthespada/CS-Books

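🔥🔥 More than 1,000 classic computer science books, personal notes, and resources referenced in the author's articles across various platforms. Book topics include C/C++, Java, Python, Go, data structures and algorithms, operating systems, backend architecture, computer systems, databases, computer networking, design patterns, frontend, assembly, and interview write-ups for campus and experienced-hire recruiting.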

21,544 (+4,336)

docusealco/docuseal

Open source DocuSign alternative. Create, fill, and sign digital documents ✍️

8,188 (+4,152)

agpl-3.0

documenso/documenso

The Open Source DocuSign Alternative.

8,893 (+3,964)

agpl-3.0

ocrmypdf/OCRmyPDF

OCRmyPDF adds an OCR text layer to scanned PDF files, allowing them to be searched

14,404 (+3,694)

mpl-2.0

Last 12-months (relative gain)

DS4SD/docling

Get your documents ready for gen AI

15,959 (+177,222%)

mit

getomni-ai/zerox

PDF to Markdown with vision models

7,086 (+64,318%)

mit

QuivrHQ/MegaParse

File Parser optimised for LLM Ingestion with no loss 🧠 Parse PDFs, Docx, PPTx in a format that is ideal for LLMs.

4,594 (+38,183%)

apache-2.0

CatchTheTornado/pdf-extract-api

Document (PDF) extraction and parse API using state of the art modern OCRs + Ollama supported models. Anonymize documents. Remove PII. Convert any document or picture to structured JSON or Markdown

1,540 (+30,700%)

gpl-3.0

seekbytes/IPA

GUI analyzer for deep-diving into PDF files. Detect malicious payloads, understand object relationships, and extract key information for threat analysis.

807 (+20,075%)

gpl-2.0

RyotaUshio/obsidian-pdf-plus

PDF++: The most Obsidian-native PDF annotation & viewing tool ever. Comes with optional Vim keybindings.

910 (+12,900%)

mit

superlinear-ai/raglite

🥤 RAGLite is a Python toolkit for Retrieval-Augmented Generation (RAG) with PostgreSQL or SQLite

604 (+7,450%)

mpl-2.0

spatie/laravel-pdf

Create PDF files in Laravel apps

738 (+5,577%)

mit

felipeall/resumeio-to-pdf

Download your resume from resume.io as PDF

445 (+4,350%)

mit

chinapandaman/PyPDFForm

🔥 The Python library for PDF forms.

475 (+3,293%)

mit

jimmc414/1filellm

Specify a GitHub or local repo, GitHub pull request, arXiv or Sci-Hub paper, YouTube transcript or documentation URL on the web and scrape into a text file and clipboard for easier LLM ingestion

641 (+2,464%)

mit

pdfslick/pdfslick

View and Interact with PDFs in React, SolidJS, Svelte and JavaScript apps

386 (+2,171%)

mit

amithkoujalgi/ollama-pdf-bot

A bot that accepts PDF docs and lets you ask questions about them.

172 (+2,050%)

mrmn2/PdfDing

Self-hosted PDF manager and viewer offering a seamless user experience on multiple devices.

91 (+1,720%)

gpl-3.0

Bklieger/Semantic

SemanticPDF: Drag, Drop, Semantic Search - SemanticPDF is a simple, privacy-focused application that makes it easy to upload a PDF file and perform a semantic search on contents.

59 (+1,375%)

mit

ThisIsSakshi/Books

Books and other resources

155 (+1,309%)

mit

KAYOKG/BibliotecaDev

📚 Library of essential programming books. (Check out my new project `SendScriptWhatsapp`)

6,161 (+1,303%)

mit

DS4SD/quackling

Build document-native LLM applications

51 (+1,175%)

mit

I2Djs/pdf-frame

pdf-frame is a web framework designed specifically for handling PDF and Canvas graphics requirements. It provides component support for popular frameworks like Vue, Nuxt and React. With its declarativ...

50 (+1,150%)

mit

lfoppiano/structure-vision

Viewer for the structure extracted by Grobid on PDF documents

42 (+950%)

apache-2.0