27 results found

118
1.3k
apache-2.0
19
First open-source data discovery and observability platform. We make life easy for data practitioners so you can focus on your business.
Created 2021-07-07
825 commits to main branch, last one 4 months ago
37
863
bsd-3-clause-clear
19
A scalable, general-purpose micro-framework for defining dataflows. THIS REPOSITORY HAS BEEN MOVED TO www.github.com/dagworks-inc/hamilton
This repository has been archived
Created 2020-05-26
516 commits to main branch, last one about a year ago
27
839
apache-2.0
8
Build data pipelines with SQL and Python, ingest data from different sources, add quality checks, and build end-to-end flows.
Created 2023-08-03
2,081 commits to main branch, last one 2 days ago
104
375
apache-2.0
10
Open data platform based on Kubernetes. Scaleph supports SeaTunnel, Flink, and Doris, backed by the SeaTunnel on Flink engine, the Flink Kubernetes Operator, and the Doris operator.
Created 2022-04-23
886 commits to dev branch, last one 24 days ago
50
218
apache-2.0
9
TechUI is an easy-to-use dynamic SVG data visualization dashboard development tool, based on Vite + Vue 2.
Created 2023-02-23
47 commits to main branch, last one 9 months ago
A curated list of open-source tools used in analytics platforms and the data engineering ecosystem
Created 2024-02-22
21 commits to main branch, last one 18 hours ago
KDP (Kubernetes Data Platform) delivers a modern, hybrid, and cloud-native data platform based on Kubernetes.
Created 2024-03-26
151 commits to main branch, last one about a month ago
Template to deploy the Data Management Zone of Cloud Scale Analytics (former Enterprise-Scale Analytics). The Data Management Zone provides data governance and management capabilities for the data pla...
Created 2020-08-07
650 commits to main branch, last one about a year ago
Template to deploy a single Data Landing Zone of the Data Management & Analytics Scenario (former Enterprise-Scale Analytics). The Data Landing Zone is a logical construct and a unit of scale in the a...
Created 2020-08-07
451 commits to main branch, last one about a year ago
ODD Specification is a universal open standard for collecting metadata.
Created 2020-12-16
188 commits to main branch, last one 3 months ago
40
122
gpl-3.0
8
AtroCore is an open-source Data Platform, Data Management and Master Data Management (MDM) software, which can be used to quickly create any business application.
Created 2020-09-24
9,024 commits to master branch, last one 2 days ago
36
100
apache-2.0
8
Python ETL framework
Created 2017-09-05
79 commits to master branch, last one 4 years ago
Example repository showing how to build a data platform with Prefect, dbt and Snowflake
Created 2022-10-12
23 commits to main branch, last one about a year ago
11
98
apache-2.0
2
A free, simple, and easy-to-use technology-style UI component, built on Vue 3
Created 2023-07-01
35 commits to main branch, last one 9 months ago
Protobuf converter plugin for Kafka Connect
This repository has been archived
Created 2018-01-03
44 commits to master branch, last one about a year ago
Template to deploy a Data Product for analytics and data science use-cases into a Data Landing Zone of the Data Management & Analytics Scenario (former Enterprise-Scale Analytics). The Data Product te...
Created 2020-08-12
207 commits to main branch, last one about a year ago
Graviti TensorBay Python SDK
Created 2021-02-24
1,213 commits to main branch, last one about a year ago
Database for study
Created 2022-12-24
30 commits to main branch, last one about a year ago
End to end data engineering project
Created 2022-03-20
13 commits to main branch, last one 2 years ago
Shoonya - Platform to annotate and label data at scale.
Created 2022-05-17
67 commits to master branch, last one 4 months ago
Open-source metadata collector based on ODD Specification
This repository has been archived
Created 2022-02-10
259 commits to main branch, last one about a year ago
Guide to data platforms and tools
Created 2022-03-09
2 commits to main branch, last one 2 years ago
This repo demonstrates a comprehensive modern data stack using popular open-source tools.
Created 2023-09-01
26 commits to main branch, last one about a year ago
🧮 Open, serverless, and local friendly Data Platform for the Filecoin Ecosystem
Created 2023-06-13
456 commits to main branch, last one 4 days ago
End-to-end data platform: A PoC data platform project using a modern data stack (Spark, Airflow, dbt, Trino, Lightdash, Hive Metastore, MinIO, Postgres)
Created 2024-08-08
108 commits to main branch, last one 3 months ago
0
21
apache-2.0
3
PawMark is a platform for developers to build, schedule and monitor data pipelines.
Created 2023-10-04
387 commits to release-0.6.1 branch, last one about a month ago