20 results found

1.0k
2.9k
apache-2.0
38
Dinky is a real-time data development platform based on Apache Flink, enabling agile data development, deployment and operation.
Created 2021-06-09
2,413 commits to dev branch, last one 22 hours ago
65
2.7k
apache-2.0
32
Hydra: Column-oriented Postgres. Add scalable analytics to your project in minutes.
Created 2022-07-22
354 commits to main branch, last one 10 days ago
115
1.5k
agpl-3.0
13
Dozer is a real-time data movement tool that leverages CDC from various sources and moves data into various sinks.
Created 2022-08-31
1,551 commits to main branch, last one about a month ago
153
503
mpl-2.0
23
From data warehouse to user profiles, from data construction to data applications
Created 2020-08-12
56 commits to master branch, last one 2 years ago
113
466
apache-2.0
28
A free to use dbt package for creating and loading Data Vault 2.0 compliant Data Warehouses (powered by dbt, an open source data engineering tool, registered trademark of dbt Labs)
Created 2019-09-27
3,956 commits to master branch, last one 16 days ago
138
451
apache-2.0
65
An open-source columnar data format designed for fast, real-time analytics on big data.
Created 2016-12-23
130 commits to master branch, last one 4 years ago
62
413
apache-2.0
11
Free and open source schema versioning and database migration made natively with .NET/6. NEW THIS MAY 2022! v1.3.15 released!
Created 2019-10-06
1,106 commits to master branch, last one 2 years ago
23
210
apache-2.0
15
Time series anomaly detection and root cause analysis on data in SQL data warehouses and databases
Created 2021-06-22
617 commits to latest_release branch, last one 2 years ago
Dataplane is an Airflow-inspired unified data platform with additional data mesh and RPA capabilities to automate, schedule and design data pipelines and workflows. Dataplane is written in Golang with a...
Created 2021-11-23
2,707 commits to main branch, last one 7 months ago
11
123
mit
6
Service for bulk-loading data to databases with automatic schema management (Redshift, Snowflake, BigQuery, ClickHouse, Postgres, MySQL)
Created 2022-06-22
504 commits to main branch, last one 18 hours ago
Code for dbt tutorial
Created 2020-04-25
63 commits to master branch, last one 7 days ago
This repository stores my personal learning materials, documents, and notes from earning the Coursera IBM Data Engineer Professional Certificate specialization.
Created 2022-12-14
1 commit to main branch, last one about a year ago
4
48
apache-2.0
2
AlphaSQL provides integrated type and schema checking and parallelization for sets of SQL files, mainly for BigQuery
Created 2020-05-15
499 commits to master branch, last one 2 years ago
3
47
apache-2.0
2
A library to accelerate ML and ETL pipelines by connecting all data sources
Created 2023-03-19
217 commits to main branch, last one about a year ago
End-to-end data engineering project
Created 2022-03-20
13 commits to main branch, last one about a year ago
6
37
apache-2.0
1
Write ETL using your favorite SQL dialects
Created 2022-07-06
80 commits to main branch, last one 4 months ago
A rule-based visual model-building engine. Supports metric definitions, rule definitions, multi-source data access, and RESTful API queries
Created 2022-05-19
10 commits to main branch, last one 2 years ago
5
25
apache-2.0
1
Code/Notes for the Data Engineering Zoomcamp by DataTalksClub
Created 2023-01-26
63 commits to main branch, last one about a year ago