24 results found

1.2k
3.2k
apache-2.0
39
Dinky is a real-time data development platform based on Apache Flink, enabling agile data development, deployment and operation.
Created 2021-06-09
2,628 commits to dev branch, last one 6 hours ago
79
2.9k
apache-2.0
31
Hydra: Column-oriented Postgres. Add scalable analytics to your project in minutes.
Created 2022-07-22
358 commits to main branch, last one about a month ago
124
1.5k
agpl-3.0
14
Dozer is a real-time data movement tool that leverages CDC from various sources and moves data into various sinks.
Created 2022-08-31
1,551 commits to main branch, last one 7 months ago
159
546
mpl-2.0
23
From data warehouse to user profiling, from data construction to data application
Created 2020-08-12
56 commits to master branch, last one 2 years ago
130
514
apache-2.0
26
A free to use dbt package for creating and loading Data Vault 2.0 compliant Data Warehouses (powered by dbt, an open source data engineering tool, registered trademark of dbt Labs)
Created 2019-09-27
4,037 commits to master branch, last one 5 months ago
131
453
apache-2.0
65
An open-source columnar data format designed for fast, real-time analytics on big data.
Created 2016-12-23
130 commits to master branch, last one 5 years ago
64
416
apache-2.0
11
Free and open source schema versioning and database migration made natively with .NET/6. NEW THIS MAY 2022! v1.3.15 released!
Created 2019-10-06
1,106 commits to master branch, last one 2 years ago
23
225
apache-2.0
15
Timeseries Anomaly detection and Root Cause Analysis on data in SQL data warehouses and databases
Created 2021-06-22
617 commits to latest_release branch, last one 2 years ago
Dataplane is an Airflow inspired unified data platform with additional data mesh and RPA capability to automate, schedule and design data pipelines and workflows. Dataplane is written in Golang with a...
Created 2021-11-23
2,726 commits to main branch, last one 3 months ago
21
149
mit
8
Service for bulk-loading data to databases with automatic schema management (Redshift, Snowflake, BigQuery, ClickHouse, Postgres, MySQL)
Created 2022-06-22
650 commits to main branch, last one 3 days ago
Code for dbt tutorial
Created 2020-04-25
89 commits to master branch, last one 6 months ago
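Hydra (nine-headed dragon): a hand-holding guide to building your own cross-platform, TB-to-PB-scale dedicated search engine, your own "eye of God". Hydra is an abstracted distributed operating system oriented toward cloud computing, multi-task scheduling, service communication, data warehousing, and microservices, illustrated by implementing a small crawler-based search engine.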
Created 2024-07-10
178 commits to beta branch, last one about a month ago
All of my individual learning materials, documents, and notes from the process of getting the Coursera IBM Data Engineer Professional Certificate specialization are stored in this repository.
Created 2022-12-14
1 commit to main branch, last one about a year ago
End to end data engineering project
Created 2022-03-20
13 commits to main branch, last one 2 years ago
4
51
apache-2.0
2
AlphaSQL provides integrated type and schema checking and parallelization for sets of SQL files, mainly for BigQuery
Created 2020-05-15
499 commits to master branch, last one 2 years ago
A curated list of awesome Online Analytical Processing databases, frameworks, resources and other awesomeness.
Created 2023-08-27
4 commits to main branch, last one about a year ago
3
47
apache-2.0
2
A library to accelerate ML and ETL pipelines by connecting all data sources
Created 2023-03-19
217 commits to main branch, last one about a year ago
A DuckDB-powered command line interface for Snowflake security, governance, operations, and cost optimization.
Created 2024-04-04
86 commits to main branch, last one 3 months ago
6
37
apache-2.0
1
Write ETL using your favorite SQL dialects
Created 2022-07-06
80 commits to main branch, last one 11 months ago
Template to perform CI/CD for Microsoft Fabric Data Warehouses
Created 2023-10-25
9 commits to main branch, last one 2 months ago
A rule-based visual model-building engine. Supports metric definitions, rule definitions, multiple data source integration, and RESTful API queries.
Created 2022-05-19
10 commits to main branch, last one 2 years ago
6
27
apache-2.0
2
Code/Notes for the Data Engineering Zoomcamp by DataTalksClub
Created 2023-01-26
63 commits to main branch, last one about a year ago