36 results found

1.2k
14.4k
apache-2.0
133
🦉 Data Versioning and ML Experiments
Created 2017-03-04
9,393 commits to main branch, last one 11 days ago
445
5.1k
agpl-3.0
41
No-code LLM Platform to launch APIs and ETL Pipelines to structure unstructured documents
Created 2024-02-21
1,004 commits to main branch, last one 18 hours ago
258
3.4k
apache-2.0
29
Towhee is a framework that is dedicated to making neural data processing pipelines simple and fast.
Created 2021-07-13
1,586 commits to main branch, last one 6 months ago
566
3.3k
apache-2.0
26
Neo4j graph construction from unstructured data using LLMs
Created 2024-01-11
1,338 commits to main branch, last one 15 hours ago
🔮 Instill Core is a full-stack AI infrastructure tool for data, model and pipeline orchestration, designed to streamline every aspect of building versatile AI-first applications
Created 2022-01-13
925 commits to main branch, last one a day ago
624
2.1k
apache-2.0
35
Dealing with all unstructured data, such as reverse image search, audio search, molecular search, video analysis, question and answer systems, NLP, etc.
Created 2019-08-09
3,455 commits to master branch, last one 21 hours ago
185
1.6k
unknown
28
Interact, analyze and structure massive text, image, embedding, audio and video datasets
Created 2022-07-21
1,084 commits to main branch, last one 2 months ago
242
1.4k
apache-2.0
136
A multi-modal vector database that supports upserts and vector queries using unified SQL (MySQL-Compatible) on structured and unstructured data, while meeting the requirements of high concurrency and ...
Created 2021-10-14
1,352 commits to develop branch, last one 20 hours ago
79
1.3k
mit
11
Get clean data from tricky documents, powered by vision-language models ⚡
Created 2024-03-22
333 commits to main branch, last one a day ago
86
1.2k
mit
18
Interactively explore unstructured datasets from your dataframe.
Created 2023-01-29
1,527 commits to main branch, last one 4 months ago
100
1.2k
apache-2.0
15
LOTUS: A semantic query engine for fast and easy LLM-powered data processing
Created 2024-07-16
167 commits to main branch, last one 4 hours ago
45
1.1k
apache-2.0
14
Fast and efficient unstructured data extraction. Written in Rust with bindings for many languages.
Created 2024-06-04
305 commits to main branch, last one 4 months ago
61
1.0k
other
13
Visual Data Transformation and Data Preparation. Low-Code Python-based ETL.
Created 2024-03-20
312 commits to main branch, last one 2 days ago
99
1.0k
apache-2.0
13
Curate better data for LLMs
Created 2023-03-23
950 commits to main branch, last one about a year ago
77
838
gpl-3.0
7
Enterprise-grade and API-first LLM workspace for unstructured documents, including data extraction, redaction, rights management, prompt playground, and more!
Created 2022-10-24
1,243 commits to main branch, last one 19 days ago
Embedding Studio is a framework which allows you to transform your Vector Database into a feature-rich Search Engine.
Created 2023-10-31
32 commits to main branch, last one about a year ago
Python implementation of jordansissel's grok regular expression library
Created 2014-07-17
93 commits to master branch, last one 6 years ago
11
275
bsd-2-clause
3
Radient turns many data types (not just text) into vectors for similarity search, RAG, regression analysis, and more.
Created 2024-02-16
42 commits to main branch, last one about a month ago
Open-source unstructured data (PDFs, images, audio files) processing platform built for knowledge workers
Created 2025-01-09
121 commits to master branch, last one about a month ago
Enforce structured output from LLMs 100% of the time
Created 2023-07-12
1 commit to main branch, last one 9 months ago
33
228
apache-2.0
14
Home of the AI workforce - Multi-agent system, AI agents & tools
Created 2021-07-05
5,806 commits to main branch, last one 2 months ago
Model Context Protocol (MCP) Server for Graphlit Platform
Created 2025-03-01
173 commits to main branch, last one 21 hours ago
Structured Data Extractor for AI Agents. Search your documents or the web for specific data and get it back in JSON or Markdown in a single tool call.
Created 2024-07-11
147 commits to master branch, last one 23 days ago
20
161
apache-2.0
1
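RAG-QA-Generator is an automated knowledge base construction and management tool for retrieval-augmented generation (RAG) systems. It reads document data, uses large language models to generate high-quality question-answer (QA) pairs, and inserts them into a database, automating the construction and management of a RAG system's knowledge base.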
Created 2024-07-17
16 commits to master branch, last one 3 months ago
How to construct knowledge graphs from unstructured data sources
Created 2024-07-31
18 commits to main branch, last one 6 months ago
11
121
unknown
3
Accurate, private and configurable document retrieval LLM
Created 2024-03-14
300 commits to main branch, last one a day ago
9
104
apache-2.0
1
An on-premises, OCR-free unstructured data extraction tool powered by vision language models.
Created 2025-03-25
107 commits to main branch, last one 5 days ago