12 results found

4.4k
47.8k
apache-2.0
224
RAGFlow is an open-source RAG (Retrieval-Augmented Generation) engine based on deep document understanding.
Created 2023-12-12
2,741 commits to main branch, last one 20 hours ago
1.6k
26.0k
mit
112
Get your documents ready for gen AI
Created 2024-07-09
432 commits to main branch, last one 21 hours ago
892
10.8k
apache-2.0
69
Open source libraries and APIs to build custom preprocessing pipelines for labeling, training, or production machine learning pipelines.
Created 2022-09-26
1,711 commits to main branch, last one 3 days ago
291
3.8k
apache-2.0
30
AutoRAG: An Open-Source Framework for Retrieval-Augmented Generation (RAG) Evaluation & Optimization with AutoML-Style Automation
Created 2024-01-10
847 commits to main branch, last one about a month ago
117
2.9k
mit
23
Improved file parsing for LLMs
Created 2024-03-22
169 commits to main branch, last one 4 months ago
Parse PDFs into markdown using Vision LLMs
Created 2024-12-16
112 commits to main branch, last one about a month ago
Complex data extraction and orchestration framework designed for processing unstructured documents. It integrates AI-powered document pipelines (GenAI, LLM, VLLM) into your applications, supporting va...
Created 2021-09-23
2,127 commits to main branch, last one 18 days ago
Tutorial on how to deskew (straighten) text images
Created 2020-09-05
4 commits to master branch, last one 3 years ago
A Python pipeline tool and plugin ecosystem for processing technical documents. Process papers from arXiv, SemanticScholar, PDF, with GROBID, LangChain, listen as podcast. Customize your own pipelines...
Created 2023-03-31
109 commits to main branch, last one 17 days ago
The invoice, document, and resume parser powered by AI.
Created 2023-01-22
38 commits to main branch, last one 4 months ago