20 results found
- Filter by Primary Language:
- Python (12)
- Java (3)
- HTML (1)
- Scala (1)
- TypeScript (1)
- PLpgSQL (1)
- JavaScript (1)
Always know what to expect from your data.
Created
2017-09-11
12,425 commits to develop branch, last one 18 hours ago
The standard data-centric AI package for data quality and machine learning with messy, real-world data and labels.
Created
2018-05-11
1,652 commits to master branch, last one 21 hours ago
OpenMetadata is a unified metadata platform for data discovery, data observability, and data governance powered by a central metadata repository, in-depth column level lineage, and seamless team colla...
Created
2021-08-01
9,690 commits to main branch, last one 19 hours ago
Deequ is a library built on top of Apache Spark for defining "unit tests for data", which measure data quality in large datasets.
Created
2018-08-07
261 commits to master branch, last one about a month ago
Compare tables within or across databases
This repository has been archived
Created
2022-03-07
1,932 commits to master branch, last one about a month ago
⚡ Data quality testing for the modern data stack (SQL, Spark, and Pandas) https://www.soda.io
Created
2020-12-14
705 commits to main branch, last one 22 hours ago
re_data - fix data issues before your users & CEO would discover them 😊
Created
2020-11-02
503 commits to master branch, last one 2 months ago
Scalable identity resolution, entity resolution, data mastering and deduplication using ML
Created
2021-08-25
1,822 commits to main branch, last one 7 days ago
ML powered analytics engine for outlier detection and root cause analysis.
Created
2021-05-27
3,306 commits to main branch, last one 3 months ago
The premier open source Data Quality solution
Created
2014-02-20
6,778 commits to master branch, last one 4 months ago
Know your data better! Datavines is a next-gen data observability platform that supports metadata management and data quality.
Created
2022-04-02
330 commits to dev branch, last one a day ago
Library for Semi-Automated Data Science
Created
2019-08-06
1,479 commits to master branch, last one about a month ago
Open Source Data Quality Monitoring.
Created
2023-07-15
128 commits to main branch, last one 4 months ago
Possibly the fastest DataFrame-agnostic quality check library in town.
Created
2021-10-05
532 commits to main branch, last one 4 days ago
Find data quality issues and clean your data in a single line of code with a Scikit-Learn compatible Transformer.
Created
2023-04-02
30 commits to main branch, last one 6 months ago
Frontend for the osmcha-django REST API
Created
2017-04-04
1,091 commits to master branch, last one 20 days ago
Installer for DataKitchen's Open Source Data Observability Products. Data breaks. Servers break. Your toolchain breaks. Ensure your team is the first to know and the first to solve with visibility acr...
Created
2024-04-18
66 commits to main branch, last one 10 days ago
Amora Data Build Tool enables analysts and engineers to transform data on the data warehouse (BigQuery) by writing Amora Models that describe the data schema using Python's "PEP484 - Type Hints" and s...
Created
2021-09-28
1,043 commits to main branch, last one 7 months ago
Makes it simple to store test results and visualise them in a BI dashboard
Created
2023-01-12
28 commits to main branch, last one 4 months ago
DataOps TestGen is part of DataKitchen's Open Source Data Observability. DataOps TestGen delivers simple, fast data quality test generation and execution by data profiling, new dataset screening an...
Created
2024-04-17
58 commits to main branch, last one 15 days ago