21 results found

753
9.8k
agpl-3.0
90
The standard data-centric AI package for data quality and machine learning with messy, real-world data and labels.
Created 2018-05-11
1,745 commits to master branch, last one 11 days ago
1.1k
5.8k
apache-2.0
47
OpenMetadata is a unified metadata platform for data discovery, data observability, and data governance powered by a central metadata repository, in-depth column level lineage, and seamless team colla...
Created 2021-08-01
11,246 commits to main branch, last one a day ago
542
3.3k
apache-2.0
80
Deequ is a library built on top of Apache Spark for defining "unit tests for data", which measure data quality in large datasets.
Created 2018-08-07
268 commits to master branch, last one 3 days ago
271
2.9k
mit
21
Compare tables within or across databases
This repository has been archived
Created 2022-03-07
1,932 commits to master branch, last one 7 months ago
217
2.0k
apache-2.0
14
:zap: Data quality testing for the modern data stack (SQL, Spark, and Pandas) https://www.soda.io
Created 2020-12-14
764 commits to main branch, last one 19 days ago
123
1.6k
other
23
re_data - fix data issues before your users & CEO would discover them 😊
Created 2020-11-02
503 commits to master branch, last one 7 months ago
121
968
agpl-3.0
16
Scalable identity resolution, entity resolution, data mastering and deduplication using ML
Created 2021-08-25
2,344 commits to main branch, last one 2 days ago
ML powered analytics engine for outlier detection and root cause analysis.
This repository has been archived
Created 2021-05-27
3,308 commits to main branch, last one 3 months ago
181
602
lgpl-3.0
63
The premier open source Data Quality solution
Created 2014-02-20
6,796 commits to main branch, last one about a month ago
163
548
apache-2.0
12
Know your data better! Datavines is a next-gen data observability platform that supports metadata management and data quality.
Created 2022-04-02
357 commits to dev branch, last one 14 days ago
21
179
apache-2.0
9
Possibly the fastest DataFrame-agnostic quality check library in town.
Created 2021-10-05
590 commits to main branch, last one 7 days ago
11
127
apache-2.0
5
Find data quality issues and clean your data in a single line of code with a Scikit-Learn compatible Transformer.
Created 2023-04-02
30 commits to main branch, last one about a year ago
Frontend for the osmcha-django REST API
Created 2017-04-04
1,116 commits to main branch, last one about a month ago
Installer for DataKitchen's Open Source Data Observability Products. Data breaks. Servers break. Your toolchain breaks. Ensure your team is the first to know and the first to solve with visibility acr...
Created 2024-04-18
84 commits to main branch, last one 18 days ago
DataOps Data Quality TestGen is part of DataKitchen's Open Source Data Observability. DataOps TestGen delivers simple, fast data quality test generation and execution by data profiling, new dataset...
Created 2024-04-17
378 commits to main branch, last one 18 days ago
30
46
lgpl-3.0
6
Data quality control system
Created 2019-09-16
190 commits to master branch, last one 3 years ago
10
40
apache-2.0
2
Makes it simple to store test results and visualise them in a BI dashboard
Created 2023-01-12
52 commits to main branch, last one 17 days ago
Datapilot-cli is the command line interface for accessing the AI teammate for engineers to ensure best practices in their SQL and dbt projects.
Created 2024-01-25
59 commits to main branch, last one about a month ago