123 results found
Code for Machine Learning for Algorithmic Trading, 2nd edition.
Created
2018-05-09
351 commits to main branch, last one about a year ago
Mimesis is a robust data generator for Python that can produce a wide range of fake data in multiple languages.
Created
2016-09-09
2,653 commits to master branch, last one about a month ago
Open source data anonymization and synthetic data orchestration for developers. Create high fidelity synthetic data and sync it across your environments.
Created
2023-08-24
2,153 commits to main branch, last one 6 hours ago
SDG is a specialized framework designed to generate high-quality structured tabular data.
Created
2023-08-10
260 commits to main branch, last one 22 hours ago
A procedural Blender pipeline for photorealistic training image generation
Created
2019-10-10
5,160 commits to main branch, last one 15 days ago
Synthetic data generation for tabular data
Created
2018-05-11
1,780 commits to main branch, last one 2 days ago
Synthetic Patient Population Simulator
Created
2016-06-17
4,856 commits to master branch, last one 6 days ago
UnrealCV: Connecting Computer Vision to Unreal Engine
Created
2016-09-08
1,179 commits to 5.2 branch, last one 4 months ago
Distilabel is a framework for synthetic data and AI feedback for engineers who need fast, reliable and scalable pipelines based on verified research papers.
Created
2023-10-16
778 commits to main branch, last one 21 days ago
Synthetic data generators for tabular and time-series data
Created
2020-05-04
255 commits to dev branch, last one a day ago
The Declarative Data Generator
Created
2020-08-09
350 commits to master branch, last one about a month ago
Conditional GAN for generating synthetic tabular data.
Created
2019-09-08
395 commits to main branch, last one a day ago
PostgreSQL database anonymization and synthetic data generation tool
Created
2023-12-01
454 commits to main branch, last one a day ago
A tool that uses advanced Monte Carlo simulations and Turbit parallel processing to create possible Bitcoin prediction scenarios.
Created
2024-08-02
6 commits to main branch, last one 3 months ago
DataDreamer: Prompt. Generate Synthetic Data. Train & Align Models. 🤖💤
Created
2023-06-02
72 commits to main branch, last one 3 months ago
Curated list of open source tooling for data-centric AI on unstructured data.
Topics: nlp, data-drift, awesome-list, noisy-labels, data-curation, deep-learning, bias-detection, explainable-ai, feature-vector, synthetic-data, active-learning, computer-vision, data-centric-ai, data-versioning, machine-learning, outlier-detection, data-visualization, documentation-only, uncertainty-estimation, robust-machine-learning
Created
2023-02-27
34 commits to main branch, last one 11 months ago
A lightweight library for generating synthetic instruction tuning datasets for your data without GPT.
Created
2024-02-25
27 commits to main branch, last one about a month ago
Configurable Generation of Synthetic Schemas and Knowledge Graphs at Your Fingertips
Created
2023-09-07
47 commits to master branch, last one 3 months ago
A multi-purpose LLM framework for RAG and data creation.
This repository has been archived
Created
2023-09-15
196 commits to main branch, last one 9 months ago
Synthetic data generators for structured and unstructured text, featuring differentially private learning.
Created
2020-03-02
343 commits to master branch, last one 7 days ago
A curated list of awesome projects which use Machine Learning to generate synthetic content.
Created
2019-02-19
50 commits to master branch, last one about a year ago
A library to model multivariate data using copulas.
Created
2017-11-13
867 commits to main branch, last one 2 days ago
Official repository for "Alignment Data Synthesis from Scratch by Prompting Aligned LLMs with Nothing". Your efficient and high-quality synthetic data generation pipeline!
Created
2024-06-12
64 commits to main branch, last one a day ago
A library for generating and evaluating synthetic tabular data for privacy, fairness and data augmentation.
Created
2022-03-18
165 commits to main branch, last one about a month ago
Official code for our CVPR '22 paper "Dataset Distillation by Matching Training Trajectories"
Created
2022-03-21
17 commits to main branch, last one 3 months ago
[ICML 2023] The official implementation of the paper "TabDDPM: Modelling Tabular Data with Diffusion Models"
Created
2022-10-02
8 commits to main branch, last one about a year ago
[IROS 2020] se(3)-TrackNet: Data-driven 6D Pose Tracking by Calibrating Image Residuals in Synthetic Domains
Created
2020-02-23
14 commits to master branch, last one about a year ago
SynthDet - An end-to-end object detection pipeline using synthetic data
Created
2020-03-26
157 commits to master branch, last one about a year ago
Generate relevant synthetic data quickly for your projects. The Databricks Labs synthetic data generator (aka `dbldatagen`) may be used to generate large simulated / synthetic data sets for test, POC...
Created
2019-07-23
296 commits to master branch, last one 3 months ago
Augmentation pipeline for rendering synthetic paper printing, faxing, scanning and copy machine processes
Created
2021-05-31
1,014 commits to dev branch, last one about a month ago