133 results found
- Primary language:
- Python (90)
- Jupyter Notebook (19)
- C# (4)
- C++ (4)
- JavaScript (2)
- Java (2)
- Go (2)
- Rust (2)
- MATLAB (1)
- Rich Text Format (1)
Code for Machine Learning for Algorithmic Trading, 2nd edition.
Created
2018-05-09
351 commits to main branch, last one about a year ago
Mimesis is a robust data generator for Python that can produce a wide range of fake data in multiple languages.
Created
2016-09-09
2,655 commits to master branch, last one 21 days ago
Open source data anonymization and synthetic data orchestration for developers. Create high fidelity synthetic data and sync it across your environments.
Created
2023-08-24
2,240 commits to main branch, last one a day ago
SDG is a specialized framework designed to generate high-quality structured tabular data.
Created
2023-08-10
276 commits to main branch, last one 18 days ago
A procedural Blender pipeline for photorealistic training image generation
Created
2019-10-10
5,178 commits to main branch, last one 5 days ago
Synthetic data generation for tabular data
Created
2018-05-11
1,810 commits to main branch, last one 3 days ago
Synthetic Patient Population Simulator
Created
2016-06-17
4,859 commits to master branch, last one about a month ago
UnrealCV: Connecting Computer Vision to Unreal Engine
Created
2016-09-08
1,179 commits to 5.2 branch, last one 5 months ago
Distilabel is a framework for synthetic data and AI feedback for engineers who need fast, reliable and scalable pipelines based on verified research papers.
Created
2023-10-16
779 commits to main branch, last one 2 days ago
Synthetic data generators for tabular and time-series data
Created
2020-05-04
257 commits to dev branch, last one 11 days ago
The Declarative Data Generator
Created
2020-08-09
350 commits to master branch, last one 2 months ago
Conditional GAN for generating synthetic tabular data.
Created
2019-09-08
399 commits to main branch, last one 25 days ago
PostgreSQL database anonymization and synthetic data generation tool
Created
2023-12-01
466 commits to main branch, last one 13 days ago
DataDreamer: Prompt. Generate Synthetic Data. Train & Align Models. 🤖💤
Created
2023-06-02
72 commits to main branch, last one 4 months ago
A tool that uses advanced Monte Carlo simulations and Turbit parallel processing to create possible Bitcoin prediction scenarios.
Created
2024-08-02
6 commits to main branch, last one 4 months ago
A lightweight library for generating synthetic instruction tuning datasets for your data without GPT.
Created
2024-02-25
27 commits to main branch, last one 3 months ago
Curated list of open source tooling for data-centric AI on unstructured data.
Topics: nlp, data-drift, awesome-list, noisy-labels, data-curation, deep-learning, bias-detection, explainable-ai, feature-vector, synthetic-data, active-learning, computer-vision, data-centric-ai, data-versioning, machine-learning, outlier-detection, data-visualization, documentation-only, uncertainty-estimation, robust-machine-learning
Created
2023-02-27
34 commits to main branch, last one about a year ago
Configurable Generation of Synthetic Schemas and Knowledge Graphs at Your Fingertips
Created
2023-09-07
47 commits to master branch, last one 5 months ago
A multi-purpose LLM framework for RAG and data creation.
This repository has been archived
Created
2023-09-15
196 commits to main branch, last one 11 months ago
Synthetic data generators for structured and unstructured text, featuring differentially private learning.
Created
2020-03-02
348 commits to master branch, last one 12 days ago
A curated list of awesome projects which use Machine Learning to generate synthetic content.
Created
2019-02-19
50 commits to master branch, last one about a year ago
A library to model multivariate data using copulas.
Created
2017-11-13
875 commits to main branch, last one about a month ago
Official repository for "Alignment Data Synthesis from Scratch by Prompting Aligned LLMs with Nothing". Your efficient and high-quality synthetic data generation pipeline!
Created
2024-06-12
65 commits to main branch, last one 18 hours ago
A library for generating and evaluating synthetic tabular data for privacy, fairness and data augmentation.
Created
2022-03-18
165 commits to main branch, last one 2 months ago
Official code for our CVPR '22 paper "Dataset Distillation by Matching Training Trajectories"
Created
2022-03-21
17 commits to main branch, last one 5 months ago
[ICML 2023] The official implementation of the paper "TabDDPM: Modelling Tabular Data with Diffusion Models"
Created
2022-10-02
8 commits to main branch, last one about a year ago
[IROS 2020] se(3)-TrackNet: Data-driven 6D Pose Tracking by Calibrating Image Residuals in Synthetic Domains
Created
2020-02-23
14 commits to master branch, last one about a year ago
Generate relevant synthetic data quickly for your projects. The Databricks Labs synthetic data generator (aka `dbldatagen`) may be used to generate large simulated / synthetic data sets for test, POC...
Created
2019-07-23
300 commits to master branch, last one a day ago
SynthDet - An end-to-end object detection pipeline using synthetic data
This repository has been archived
Created
2020-03-26
158 commits to master branch, last one 15 days ago
Augmentation pipeline for rendering synthetic paper printing, faxing, scanning and copy machine processes
Created
2021-05-31
1,014 commits to dev branch, last one 3 months ago