10 results found

109
1.1k
apache-2.0
18
ExtractThinker is a Document Intelligence library for LLMs, offering ORM-style interaction for flexible and powerful document workflows.
Created 2024-02-01
382 commits to main branch, last one 2 days ago
115
374
gpl-3.0
27
Generic framework for historical document processing
Created 2017-07-13
381 commits to master branch, last one 4 years ago
25
172
apache-2.0
11
⚡ Cloud-native, AI-powered document processing pipelines on AWS.
Created 2023-11-23
374 commits to main branch, last one 5 days ago
A full-featured Document Management Platform / Document Layer for your application, providing storage, discovery, processing, and retrieval. Deploys directly into your Amazon Web Services Cloud. Pleas...
Created 2020-11-10
268 commits to master branch, last one 11 days ago
6
76
apache-2.0
5
A Python framework for multi-modal document understanding with Amazon Bedrock
Created 2024-04-17
94 commits to main branch, last one 6 days ago
Retrieval of fully structured data made easy. Use LLMs or custom models. Specializes in PDFs and HTML files. Extensive support for tabular data extraction and multimodal queries.
Created 2024-02-14
255 commits to master branch, last one 2 days ago
Conversion of PDF documents to structured Markdown, optimized for Retrieval Augmented Generation (RAG) and other NLP tasks. Extract text, tables, and images with preserved formatting for enhanced info...
Created 2024-09-10
26 commits to main branch, last one 4 months ago
Enhanced Document Understanding on AWS delivers an easy-to-use web application that ingests and analyzes documents, extracts content, identifies and redacts sensitive customer information, and creates...
Created 2023-08-16
96 commits to main branch, last one 11 days ago
A comprehensive list of annotated training datasets classified by use case.
Created 2022-05-25
205 commits to main branch, last one 2 years ago
An advanced distributed knowledge fabric for intelligent document processing, featuring multi-document agents, optimized query handling, and semantic understanding.
Created 2024-04-22
16 commits to main branch, last one 7 months ago