17 results found
- Filter by Primary Language:
- Python (10)
- TypeScript (2)
- Cython (1)
- Java (1)
- Rust (1)
- Visual Basic .NET (1)
A one-stop, open-source, high-quality data extraction tool that supports PDF, webpage, and multi-format e-book extraction.
Created
2024-02-29
1,654 commits to master branch, last one 20 hours ago
A pure-Python PDF library capable of splitting, merging, cropping, and transforming the pages of PDF files
Created
2012-01-06
1,770 commits to main branch, last one 2 days ago
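A beautiful and powerful online design tool with poster design and image editing features; an open-source take on 【稿定设计】 built on fabric.js. Suitable for many scenarios, such as poster generation, e-commerce product images, long-form article images, and video or WeChat official account cover editing.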
Created
2023-05-25
896 commits to main branch, last one 23 hours ago
Easily deployable 🚀 API to convert PDF to markdown quickly with high accuracy.
Created
2024-05-10
169 commits to master branch, last one 22 days ago
Python PDF parser for scientific publications: content and figures
Created
2019-07-03
59 commits to master branch, last one 7 months ago
Analyze PDFs. With colors. And Yara.
Created
2022-09-14
216 commits to master branch, last one 11 days ago
A package for parsing PDFs and analyzing their content using LLMs.
Created
2024-07-26
28 commits to main branch, last one 3 months ago
Dedoc is a library (service) for automated document parsing and conversion to a uniform format. It automatically extracts content, logical structure, tables, and meta information from textual electronic...
Created
2020-12-07
295 commits to master branch, last one about a month ago
A Python client for the Sypht API
Created
2018-08-20
212 commits to master branch, last one about a year ago
Parser for Consolidated Account Statements (CAS) generated from CAMS/Karvy/Kfintech
Created
2020-10-10
243 commits to main branch, last one 9 months ago
A Java client for the Sypht API
Created
2019-04-05
85 commits to master branch, last one 4 years ago
Fast and efficient unstructured data extraction. Written in Rust with bindings for many languages.
Created
2024-06-04
219 commits to main branch, last one 2 days ago
Sample code for the Datalogics C++, Java, and .NET interfaces of the Adobe PDF Library
Created
2017-03-28
247 commits to master branch, last one about a year ago
C# and VB.NET samples for Docotic.Pdf library
Created
2017-12-13
555 commits to master branch, last one 15 days ago
Next.js template for seamless PDF parsing using pdf2json and FilePond. Ideal for developers seeking a ready-to-use solution for PDF content extraction in Next.js projects.
Created
2023-08-03
10 commits to main branch, last one 11 months ago
PDF parsing toolkit for preparing an academic text corpus
Created
2023-09-03
4 commits to main branch, last one 3 months ago
Fast and memory-efficient Python PDF Parser based on xpdf sources
Created
2020-03-28
318 commits to dev branch, last one 2 years ago