48 results found

207
2.3k
apache-2.0
15
A framework for prompt tuning using Intent-based Prompt Calibration
Created 2023-12-02
165 commits to main branch, last one 2 months ago
160
2.2k
apache-2.0
19
Distilabel is a framework for synthetic data and AI feedback for engineers who need fast, reliable and scalable pipelines based on verified research papers.
Created 2023-10-16
840 commits to main branch, last one 3 days ago
Perception toolkit for sim2real training and validation in Unity
Created 2020-04-03
1,439 commits to main branch, last one 2 months ago
DataDreamer: Prompt. Generate Synthetic Data. Train & Align Models. 🤖💤
Created 2023-06-02
81 commits to main branch, last one a day ago
46
733
bsd-3-clause
11
A lightweight library for generating synthetic instruction tuning datasets for your data without GPT.
Created 2024-02-25
27 commits to main branch, last one 4 months ago
Configurable Generation of Synthetic Schemas and Knowledge Graphs at Your Fingertips
Created 2023-09-07
47 commits to master branch, last one 6 months ago
44
610
apache-2.0
5
Synthetic Data curation for post-training and structured data extraction
Created 2024-10-28
1,132 commits to main branch, last one 23 hours ago
Official repository for ICLR 2025 paper "Alignment Data Synthesis from Scratch by Prompting Aligned LLMs with Nothing". Your efficient and high-quality synthetic data generation pipeline!
Created 2024-06-12
71 commits to main branch, last one 9 days ago
A curated list of awesome projects which use Machine Learning to generate synthetic content.
Created 2019-02-19
50 commits to master branch, last one 2 years ago
Augmentation pipeline for rendering synthetic paper printing, faxing, scanning and copy machine processes
Created 2021-05-31
1,014 commits to dev branch, last one 4 months ago
56
367
apache-2.0
18
SynthDet - An end-to-end object detection pipeline using synthetic data
This repository has been archived
Created 2020-03-26
158 commits to master branch, last one about a month ago
Random dataframe and database table generator
Created 2018-03-10
73 commits to master branch, last one 3 years ago
74
303
bsd-3-clause-clear
5
[IMC 2020 (Best Paper Finalist)] Using GANs for Sharing Networked Time Series Data: Challenges, Initial Promise, and Open Questions
Created 2019-09-28
26 commits to master branch, last one about a year ago
4
278
cc-by-4.0
29
[NeurIPS D&B Track 2024] Official implementation of HumanVid
Created 2024-07-19
21 commits to main branch, last one 27 days ago
13
270
unknown
5
Compose multimodal datasets 🎹
Created 2024-02-17
134 commits to main branch, last one about a month ago
awesome synthetic (text) datasets
Created 2024-02-21
41 commits to main branch, last one 3 months ago
A suite of auto-regressive and Seq2Seq (sequence-to-sequence) transformer models for tabular and relational synthetic data generation.
Created 2022-11-07
36 commits to main branch, last one about a month ago
[ACL 2024] LLM2LLM: Boosting LLMs with Novel Iterative Data Enhancement
Created 2024-03-21
3 commits to main branch, last one 10 months ago
25
171
mit
4
[CVPR 2021] DeFMO: Deblurring and Shape Recovery of Fast Moving Objects
Created 2021-02-06
52 commits to master branch, last one 2 years ago
This is the dataset and code release of the OpenRooms Dataset. For more information, please refer to our webpage below. Thanks a lot for your interest in our research!
Created 2021-05-17
110 commits to main branch, last one 10 months ago
12
144
apache-2.0
2
Solving data for LLMs - Create quality synthetic datasets!
Created 2024-06-22
126 commits to main branch, last one 11 days ago
BEDLAM (CVPR 2023) render pipeline tools
Created 2023-06-20
9 commits to main branch, last one 9 months ago
NVIDIA Dataset Utilities (NVDU)
Created 2018-07-12
15 commits to master branch, last one 5 years ago
Dataset Diffusion: Diffusion-based Synthetic Data Generation for Pixel-Level Semantic Segmentation (NeurIPS 2023)
Created 2023-09-25
15 commits to main branch, last one 4 months ago
Repository to identify Lego bricks automatically only using images
Created 2018-09-18
43 commits to master branch, last one 3 years ago
Example scripts that showcase how to use Private AI Text to de-identify, redact, hash, tokenize, mask and synthesize PII in text.
Created 2021-11-18
142 commits to main branch, last one 10 days ago
Optimize Document Retrieval with Fine-Tuned KnowledgeBases
Created 2025-01-22
141 commits to main branch, last one 3 days ago