10 results found

32
1.1k
other
9
Open-source platform for extracting structured data from documents using AI.
Created 2024-11-17
36 commits to main branch, last one 2 days ago
69
477
apache-2.0
10
ExtractThinker is a Document Intelligence library for LLMs, offering ORM-style interaction for flexible and powerful document workflows.
Created 2024-02-01
233 commits to main branch, last one a day ago
116
373
gpl-3.0
28
Generic framework for historical document processing
Created 2017-07-13
381 commits to master branch, last one 4 years ago
23
149
apache-2.0
11
⚡ Cloud-native, AI-powered, document processing pipelines on AWS.
Created 2023-11-23
365 commits to main branch, last one about a month ago
A full-featured Document Layer for your application, providing the functionality of a flexible document management system, including storage, discovery, processing, and retrieval. Deploys directly int...
Created 2020-11-10
244 commits to master branch, last one 22 days ago
Retrieval of fully structured data made easy. Use LLMs or custom models. Specialized in PDFs and HTML files. Extensive support for tabular data extraction and multimodal queries.
Created 2024-02-14
230 commits to master branch, last one 8 days ago
4
57
apache-2.0
5
A Python framework for multi-modal document understanding with Amazon Bedrock
Created 2024-04-17
68 commits to main branch, last one 2 months ago
Enhanced Document Understanding on AWS delivers an easy-to-use web application that ingests and analyzes documents, extracts content, identifies and redacts sensitive customer information, and creates...
Created 2023-08-16
81 commits to main branch, last one 4 days ago
Conversion of PDF documents to structured Markdown, optimized for Retrieval Augmented Generation (RAG) and other NLP tasks. Extract text, tables, and images with preserved formatting for enhanced info...
Created 2024-09-10
26 commits to main branch, last one 29 days ago
A comprehensive list of annotated training datasets classified by use case.
Created 2022-05-25
205 commits to main branch, last one 2 years ago