36 results found
- Filter by Primary Language:
- Python (23)
- TypeScript (5)
- Jupyter Notebook (2)
- HTML (2)
- Makefile (1)
- Java (1)
- Rust (1)
🦉 Data Versioning and ML Experiments
Created
2017-03-04
9,393 commits to main branch, last one 11 days ago
Refine high-quality datasets and visual AI models
Created
2020-04-22
23,211 commits to develop branch, last one 3 days ago
No-code LLM Platform to launch APIs and ETL Pipelines to structure unstructured documents
Created
2024-02-21
1,004 commits to main branch, last one 18 hours ago
Towhee is a framework that is dedicated to making neural data processing pipelines simple and fast.
Created
2021-07-13
1,586 commits to main branch, last one 6 months ago
Neo4j graph construction from unstructured data using LLMs
Created
2024-01-11
1,338 commits to main branch, last one 15 hours ago
🔮 Instill Core is a full-stack AI infrastructure tool for data, model and pipeline orchestration, designed to streamline every aspect of building versatile AI-first applications
Created
2022-01-13
925 commits to main branch, last one a day ago
Dealing with all unstructured data, such as reverse image search, audio search, molecular search, video analysis, question and answer systems, NLP, etc.
Created
2019-08-09
3,455 commits to master branch, last one 21 hours ago
Interact, analyze and structure massive text, image, embedding, audio and video datasets
Created
2022-07-21
1,084 commits to main branch, last one 2 months ago
A multi-modal vector database that supports upserts and vector queries using unified SQL (MySQL-Compatible) on structured and unstructured data, while meeting the requirements of high concurrency and ...
Created
2021-10-14
1,352 commits to develop branch, last one 20 hours ago
A curated list of resources for Document Understanding (DU) topic
Topics: nlp, ocr, pdf, rpa, awesome, document-ai, awesome-list, deep-learning, pdf-documents, machine-learning, document-analysis, unstructured-data, document-intelligence, document-understanding, information-extraction, intelligent-processing, document-layout-analysis, key-information-extraction, robotic-process-automation, natural-language-processing
Created
2021-04-06
76 commits to main branch, last one about a year ago
Get clean data from tricky documents, powered by vision-language models ⚡
Created
2024-03-22
333 commits to main branch, last one a day ago
Interactively explore unstructured datasets from your dataframe.
Created
2023-01-29
1,527 commits to main branch, last one 4 months ago
LOTUS: A semantic query engine for fast and easy LLM-powered data processing
Created
2024-07-16
167 commits to main branch, last one 4 hours ago
Fast and efficient unstructured data extraction. Written in Rust with bindings for many languages.
Created
2024-06-04
305 commits to main branch, last one 4 months ago
Visual Data Transformation and Data Preparation. Low-Code Python-based ETL.
Created
2024-03-20
312 commits to main branch, last one 2 days ago
Curate better data for LLMs
Created
2023-03-23
950 commits to main branch, last one about a year ago
Enterprise-grade and API-first LLM workspace for unstructured documents, including data extraction, redaction, rights management, prompt playground, and more!
Created
2022-10-24
1,243 commits to main branch, last one 19 days ago
NucliaDB, The AI Search database for RAG
Created
2022-04-05
3,186 commits to main branch, last one 5 days ago
Embedding Studio is a framework that lets you transform your Vector Database into a feature-rich Search Engine.
Created
2023-10-31
32 commits to main branch, last one about a year ago
Python implementation of jordansissel's grok regular expression library
Created
2014-07-17
93 commits to master branch, last one 6 years ago
Radient turns many data types (not just text) into vectors for similarity search, RAG, regression analysis, and more.
Created
2024-02-16
42 commits to main branch, last one about a month ago
Open-source unstructured data (PDFs, images, audio files) processing platform built for knowledge workers
Created
2025-01-09
121 commits to master branch, last one about a month ago
Enforce structured output from LLMs 100% of the time
Created
2023-07-12
1 commit to main branch, last one 9 months ago
Home of the AI workforce - Multi-agent system, AI agents & tools
Created
2021-07-05
5,806 commits to main branch, last one 2 months ago
Model Context Protocol (MCP) Server for Graphlit Platform
Created
2025-03-01
173 commits to main branch, last one 21 hours ago
Structured Data Extractor for AI Agents. Search your documents or the web for specific data and get it back in JSON or Markdown in a single tool call.
Created
2024-07-11
147 commits to master branch, last one 23 days ago
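RAG-QA-Generator is an automated knowledge base construction and management tool for retrieval-augmented generation (RAG) systems. It reads document data, uses large language models to generate high-quality question-answer (QA) pairs, and inserts them into a database, automating the construction and management of the RAG system's knowledge base.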
Created
2024-07-17
16 commits to master branch, last one 3 months ago
How to construct knowledge graphs from unstructured data sources
Created
2024-07-31
18 commits to main branch, last one 6 months ago
Accurate, private and configurable document retrieval LLM
Created
2024-03-14
300 commits to main branch, last one a day ago
An on-premises, OCR-free unstructured data extraction tool powered by vision language models.
Created
2025-03-25
107 commits to main branch, last one 5 days ago