10 results found

105
1.0k
apache-2.0
15
ExtractThinker is a Document Intelligence library for LLMs, offering ORM-style interaction for flexible and powerful document workflows.
Created 2024-02-01
345 commits to main branch, last one 7 days ago
116
374
gpl-3.0
28
Generic framework for historical document processing
Created 2017-07-13
381 commits to master branch, last one 4 years ago
24
164
apache-2.0
12
Cloud-native, AI-powered document processing pipelines on AWS.
Created 2023-11-23
365 commits to main branch, last one 2 months ago
A full-featured Document Layer for your application, providing the functionality of a flexible document management system, including storage, discovery, processing, and retrieval. Deploys directly int...
Created 2020-11-10
264 commits to master branch, last one 4 days ago
5
68
apache-2.0
5
A Python framework for multi-modal document understanding with Amazon Bedrock
Created 2024-04-17
68 commits to main branch, last one 4 months ago
Retrieval of fully structured data made easy. Use LLMs or custom models. Specializes in PDFs and HTML files. Extensive support for tabular data extraction and multimodal queries.
Created 2024-02-14
253 commits to master branch, last one 13 days ago
Conversion of PDF documents to structured Markdown, optimized for Retrieval Augmented Generation (RAG) and other NLP tasks. Extract text, tables, and images with preserved formatting for enhanced info...
Created 2024-09-10
26 commits to main branch, last one 2 months ago
Enhanced Document Understanding on AWS delivers an easy-to-use web application that ingests and analyzes documents, extracts content, identifies and redacts sensitive customer information, and creates...
Created 2023-08-16
87 commits to main branch, last one 20 days ago
A comprehensive list of annotated training datasets classified by use case.
Created 2022-05-25
205 commits to main branch, last one 2 years ago
An advanced distributed knowledge fabric for intelligent document processing, featuring multi-document agents, optimized query handling, and semantic understanding.
Created 2024-04-22
16 commits to main branch, last one 6 months ago