48 results found
- Filter by Primary Language:
- Python (31)
- Jupyter Notebook (10)
- C# (3)
- Rich Text Format (1)
A framework for prompt tuning using Intent-based Prompt Calibration
Created: 2023-12-02
165 commits to main branch, last one 2 months ago
Distilabel is a framework for synthetic data and AI feedback for engineers who need fast, reliable and scalable pipelines based on verified research papers.
Created: 2023-10-16
840 commits to main branch, last one 3 days ago
Perception toolkit for sim2real training and validation in Unity
Created: 2020-04-03
1,439 commits to main branch, last one 2 months ago
DataDreamer: Prompt. Generate Synthetic Data. Train & Align Models. 🤖💤
Created: 2023-06-02
81 commits to main branch, last one a day ago
A lightweight library for generating synthetic instruction tuning datasets for your data without GPT.
Created: 2024-02-25
27 commits to main branch, last one 4 months ago
Configurable Generation of Synthetic Schemas and Knowledge Graphs at Your Fingertips
Created: 2023-09-07
47 commits to master branch, last one 6 months ago
Synthetic Data curation for post-training and structured data extraction
Created: 2024-10-28
1,132 commits to main branch, last one 23 hours ago
Official repository for ICLR 2025 paper "Alignment Data Synthesis from Scratch by Prompting Aligned LLMs with Nothing". Your efficient and high-quality synthetic data generation pipeline!
Created: 2024-06-12
71 commits to main branch, last one 9 days ago
A curated list of awesome projects which use Machine Learning to generate synthetic content.
Created: 2019-02-19
50 commits to master branch, last one 2 years ago
Augmentation pipeline for rendering synthetic paper printing, faxing, scanning and copy machine processes
Created: 2021-05-31
1,014 commits to dev branch, last one 4 months ago
SynthDet - An end-to-end object detection pipeline using synthetic data
This repository has been archived
Created: 2020-03-26
158 commits to master branch, last one about a month ago
Unity's privacy-preserving human-centric synthetic data generator
Topics: unity, unity3d, labeling, icml-2022, perception, billing-5160, deep-learning, synthetic-data, computer-vision, pose-estimation, human-centric-ml, object-detection, transfer-learning, synthetic-datasets, applied-ml-research, human-pose-estimation, owner-machine-learning, synthetic-data-generation, human-activity-recognition, synthetic-dataset-generation
Created: 2021-08-24
240 commits to main branch, last one 11 months ago
Random dataframe and database table generator
Created: 2018-03-10
73 commits to master branch, last one 3 years ago
[IMC 2020 (Best Paper Finalist)] Using GANs for Sharing Networked Time Series Data: Challenges, Initial Promise, and Open Questions
Created: 2019-09-28
26 commits to master branch, last one about a year ago
[NeurIPS D&B Track 2024] Official implementation of HumanVid
Created: 2024-07-19
21 commits to main branch, last one 27 days ago
Compose multimodal datasets 🎹
Created: 2024-02-17
134 commits to main branch, last one about a month ago
awesome synthetic (text) datasets
Created: 2024-02-21
41 commits to main branch, last one 3 months ago
A suite of auto-regressive and Seq2Seq (sequence-to-sequence) transformer models for tabular and relational synthetic data generation.
Created: 2022-11-07
36 commits to main branch, last one about a month ago
DataGene - Identify How Similar TS Datasets Are to One Another (by @firmai)
Topics: finance, encoding, synthesizers, decomposition, model-checking, synthetic-data, data-structures, similarity-score, distance-measures, testing-framework, dataset-generation, dataset-similarity, similarity-measures, data-transformations, distance-calculations, predictive-maintenance, transformation-recipes, synthetic-dataset-generation
Created: 2020-05-09
144 commits to master branch, last one 2 years ago
[ACL 2024] LLM2LLM: Boosting LLMs with Novel Iterative Data Enhancement
Created: 2024-03-21
3 commits to main branch, last one 10 months ago
[CVPR 2021] DeFMO: Deblurring and Shape Recovery of Fast Moving Objects
Created: 2021-02-06
52 commits to master branch, last one 2 years ago
This is the dataset and code release of the OpenRooms Dataset. For more information, please refer to our webpage below. Thanks a lot for your interest in our research!
Created: 2021-05-17
110 commits to main branch, last one 10 months ago
Solving data for LLMs - Create quality synthetic datasets!
Created: 2024-06-22
126 commits to main branch, last one 11 days ago
BEDLAM (CVPR 2023) render pipeline tools
Created: 2023-06-20
9 commits to main branch, last one 9 months ago
NVIDIA Dataset Utilities (NVDU)
Created: 2018-07-12
15 commits to master branch, last one 5 years ago
Dataset Diffusion: Diffusion-based Synthetic Data Generation for Pixel-Level Semantic Segmentation (NeurIPS2023)
Created: 2023-09-25
15 commits to main branch, last one 4 months ago
Repository to identify Lego bricks automatically only using images
Created: 2018-09-18
43 commits to master branch, last one 3 years ago
(SIGCOMM '22) Practical GAN-based Synthetic IP Header Trace Generation using NetShare
Topics: gan, gans, pcap, netflow, privacy, netflow-v9, tensorflow, gans-models, time-series, netflow-data, pcap-generator, synthetic-data, machine-learning, privacy-preserving, differential-privacy, synthetic-data-generator, synthetic-data-generation, synthetic-dataset-generation, generative-adversarial-network, differential-privacy-deep-learning
Created: 2022-06-15
167 commits to master branch, last one about a year ago
Example scripts that showcase how to use Private AI Text to de-identify, redact, hash, tokenize, mask and synthesize PII in text.
Created: 2021-11-18
142 commits to main branch, last one 10 days ago
Optimize Document Retrieval with Fine-Tuned KnowledgeBases
Created: 2025-01-22
141 commits to main branch, last one 3 days ago