19 results found

684
8.9k
agpl-3.0
85
The standard data-centric AI package for data quality and machine learning with messy, real-world data and labels.
Created 2018-05-11
1,629 commits to master branch, last one 7 days ago
510
6.8k
apache-2.0
53
The open-source tool for building high-quality datasets and computer vision models
Created 2020-04-22
20,368 commits to develop branch, last one a day ago
173
3.0k
other
109
A Doctor for your data
Created 2023-05-02
32 commits to master branch, last one 4 months ago
65
1.4k
apache-2.0
16
The data scientist's open-source choice to scale, assess and maintain natural language data. Treat training data like a software artifact.
Created 2022-07-04
68 commits to main branch, last one about a month ago
118
1.1k
apache-2.0
68
Resources for Data Centric AI
Created 2021-06-11
296 commits to main branch, last one about a year ago
83
1.0k
mit
18
Interactively explore unstructured datasets from your dataframe.
Created 2023-01-29
1,464 commits to main branch, last one 3 days ago
A curated, but incomplete, list of data-centric AI resources.
Created 2023-03-07
68 commits to main branch, last one about a month ago
69
935
agpl-3.0
15
Automatically find issues in image datasets and practice data-centric computer vision.
Created 2022-05-26
335 commits to main branch, last one 2 months ago
140
404
agpl-3.0
13
Lab assignments for Introduction to Data-Centric AI, MIT IAP 2024 👩🏽‍💻
Created 2022-12-05
37 commits to master branch, last one 5 months ago
30
212
apache-2.0
6
[NeurIPS 2021] WRENCH: Weak supeRvision bENCHmark
Created 2021-08-23
173 commits to main branch, last one 3 months ago
9
121
apache-2.0
3
[NeurIPS 2023] This is the code for the paper `Large Language Model as Attributed Training Data Generator: A Tale of Diversity and Bias`.
Created 2023-05-31
26 commits to main branch, last one 7 months ago
Introduction to Data-Centric AI, MIT IAP 2023 🤖
Created 2022-12-05
209 commits to master branch, last one 6 days ago
9
84
lgpl-3.0
4
pyDVL is a library of stable implementations of algorithms for data valuation and influence function computation
Created 2021-04-02
3,467 commits to develop branch, last one 22 days ago
OpenDataVal: a Unified Benchmark for Data Valuation in Python (NeurIPS 2023)
Created 2023-06-07
316 commits to main branch, last one 3 months ago
Self-Evolved Diverse Data Sampling for Efficient Instruction Tuning
Created 2023-11-14
6 commits to master branch, last one 5 months ago
nbsynthetic is a simple and robust tabular synthetic data generation library for small and medium-sized datasets
Created 2022-08-29
246 commits to master branch, last one about a year ago
[ECCV 2022] Official Implementation for Unsupervised Selective Labeling for More Effective Semi-Supervised Learning
Created 2022-07-20
11 commits to main branch, last one 10 months ago
3
47
unknown
3
A Data Centric NER annotation tool for your Named Entity Recognition projects
Created 2020-09-13
32 commits to main branch, last one about a month ago