6 results found

51
253
apache-2.0
36
GUNDAM is a data management system that prioritizes data using language models.
Created 2023-06-05
54 commits to main branch, last one about a year ago
24
242
lgpl-3.0
9
Graphical tool for data manipulation written in C++/Qt.
Created 2019-01-19
571 commits to main branch, last one 9 days ago
19
236
mit
21
DSIR: a large-scale data selection framework for language model training
Created 2023-01-30
67 commits to main branch, last one 9 months ago
12
151
mit
10
⏳ Provides filtering, sanitizing, and conversion of Golang data.
Created 2018-09-26
161 commits to master branch, last one 11 days ago
14
60
mit
1
Exponentially Weighted Moving Average Filter
Created 2018-10-01
27 commits to master branch, last one 5 years ago
1
42
apache-2.0
3
Official implementation of our paper "Finetuned Multimodal Language Models are High-Quality Image-Text Data Filters".
Created 2024-02-27
29 commits to main branch, last one about a month ago