63 results found
- Filter by Primary Language:
- Python (22)
- Go (12)
- TypeScript (5)
- Java (5)
- Shell (3)
- Jupyter Notebook (2)
- HTML (2)
- PLpgSQL (1)
- C++ (1)
- C# (1)
- PowerShell (1)
- Rust (1)
- Smarty (1)
The standard data-centric AI package for data quality and machine learning with messy, real-world data and labels.
Created
2018-05-11
1,772 commits to master branch, last one 4 days ago
Scalable and flexible workflow orchestration platform that seamlessly unifies data, ML and analytics stacks.
Created
2019-10-21
4,646 commits to master branch, last one 2 days ago
Modern columnar data format for ML and LLMs implemented in Rust. Convert from parquet in 2 lines of code for 100x faster random access, vector index, and data versioning. Compatible with Pandas, DuckD...
Created
2022-07-07
2,296 commits to main branch, last one 3 days ago
Redpanda Console is a developer-friendly UI for managing your Kafka/Redpanda workloads. Console gives you a simple, interactive approach for gaining visibility into your topics, masking data, managing...
Created
2019-09-29
4,603 commits to master branch, last one 3 days ago
An open-source data logging library for machine learning models and data pipelines. 📚 Provides visibility into data quality & model performance over time. 🛡️ Supports privacy-preserving data collect...
Created
2020-08-14
936 commits to mainline branch, last one 3 months ago
Scalable and efficient data transformation framework - backwards compatible with dbt.
Created
2022-09-23
3,473 commits to main branch, last one 11 hours ago
The dbt-native data observability solution for data & analytics engineers. Monitor your data pipelines in minutes. Available as self-hosted or cloud service with premium features.
Created
2021-08-30
5,286 commits to master branch, last one 6 days ago
Kafka Docker for development. Kafka, Zookeeper, Schema Registry, Kafka-Connect, 20+ connectors
Created
2016-08-19
571 commits to fdd/main branch, last one 8 months ago
Meltano: the declarative code-first data integration engine that powers your wildest data and ML-powered product ideas. Say goodbye to writing, maintaining, and scaling your own API integrations.
Created
2021-06-21
11,866 commits to main branch, last one a day ago
Cloud Native DataOps & AIOps Platform
Created
2022-03-16
1,466 commits to main branch, last one about a year ago
Supports agile DataOps based on Flink, DataX, Flink-CDC, and Chunjun, with a web UI.
Created
2019-01-23
1,189 commits to master branch, last one a day ago
📙 Awesome Data Catalogs and Observability Platforms.
Created
2021-07-14
94 commits to main branch, last one 12 days ago
Optimus is an easy-to-use, reliable, and performant workflow orchestrator for data transformation, data modeling, pipelines, and data quality management.
Created
2021-03-22
487 commits to main branch, last one about a year ago
Tenzir is the data pipeline engine for security teams.
Created
2010-09-23
24,257 commits to main branch, last one 4 days ago
DataOps for Microsoft Data Platform technologies. https://aka.ms/dataops-repo
Created
2019-12-06
781 commits to main branch, last one 4 days ago
A list of tools for annotating data, managing annotations, etc.
Created
2018-11-08
87 commits to master branch, last one 2 years ago
Engine for ML/Data tracking, visualization, explainability, drift detection, and dashboards for Polyaxon.
Created
2016-03-25
9,968 commits to master branch, last one 7 days ago
Titan Core - Snowflake infrastructure-as-code. Provision environments, automate deploys, CI/CD. Manage RBAC, users, roles, and data access. Declarative Python Resource API. Change Management tool for ...
Created
2023-05-13
276 commits to main branch, last one about a month ago
One framework to develop, deploy and operate data workflows with Python and SQL.
Created
2021-07-20
2,194 commits to main branch, last one 28 days ago
Open data platform based on Kubernetes. Scaleph supports SeaTunnel, Flink, and Doris, backed by the SeaTunnel on Flink engine, the Flink Kubernetes Operator, and the Doris operator.
Created
2022-04-23
886 commits to dev branch, last one 3 months ago
Power BI DevOps & Source Control Tool
Created
2020-05-31
713 commits to main branch, last one about a month ago
Firehose is an extensible, no-code, and cloud-native service to load real-time streaming data from Kafka to data stores, data lakes, and analytical storage systems.
Created
2021-01-29
529 commits to main branch, last one about a year ago
The data-validation toolkit for enhanced dbt (data build tool) PR review
Created
2023-10-06
2,308 commits to main branch, last one 15 hours ago
Frontier is an all-in-one user management platform that provides identity, access and billing management to help organizations secure their systems and data. (Open source alternative to Clerk, WorkOS)
Created
2021-02-26
1,118 commits to main branch, last one 13 hours ago
A Python library that enables ML teams to share, load, and transform data in a collaborative, flexible, and efficient way.
Created
2022-02-11
130 commits to main branch, last one 2 months ago
Dagger is an easy-to-use, configuration-over-code, cloud-native framework built on top of Apache Flink for stateful processing of real-time streaming data.
Created
2021-03-22
825 commits to main branch, last one about a year ago
An open source development framework to help you build data workflows and modern data architecture on AWS.
Created
2022-02-16
566 commits to main branch, last one 25 days ago
Stencil is a schema registry that provides schema management and validation dynamically, efficiently, and reliably to ensure data compatibility across applications.
Created
2019-02-16
314 commits to main branch, last one 27 days ago
Raccoon is a high-throughput, low-latency service to collect events in real-time from your web, mobile apps, and services using multiple network protocols.
Created
2021-03-22
213 commits to main branch, last one 6 months ago
Meteor is an easy-to-use, plugin-driven metadata collection framework to extract data from different sources and sink to any data catalog.
Created
2021-03-22
375 commits to main branch, last one 5 months ago