32 results found

This is a repo with links to everything you'd ever want to learn about data engineering
Created 2023-11-19
171 commits to main branch, last one about a month ago
913
4.7k
apache-2.0
46
OpenMetadata is a unified metadata platform for data discovery, data observability, and data governance powered by a central metadata repository, in-depth column level lineage, and seamless team colla...
Created 2021-08-01
9,690 commits to main branch, last one 19 hours ago
251
2.9k
mit
22
Compare tables within or across databases
This repository has been archived
Created 2022-03-07
1,932 commits to master branch, last one about a month ago
120
1.5k
apache-2.0
19
Efficient data transformation and modeling framework that is backwards compatible with dbt.
Created 2022-09-23
2,559 commits to main branch, last one 12 hours ago
109
914
agpl-3.0
18
Scalable identity resolution, entity resolution, data mastering and deduplication using ML
Created 2021-08-25
1,822 commits to main branch, last one 7 days ago
114
477
apache-2.0
26
A free-to-use dbt package for creating and loading Data Vault 2.0 compliant data warehouses (powered by dbt, an open source data engineering tool, registered trademark of dbt Labs)
Created 2019-09-27
4,037 commits to master branch, last one 21 days ago
21
278
other
2
This repository has no description...
Created 2022-06-09
1,033 commits to master branch, last one about a month ago
This repository provides various demos/examples of using Snowpark for Python.
Created 2022-05-26
226 commits to main branch, last one 3 months ago
20
248
apache-2.0
9
An open source development framework to help you build data workflows and modern data architecture on AWS.
Created 2022-02-16
534 commits to main branch, last one 4 days ago
Code and data for the Modern Polars book
Created 2022-12-21
157 commits to master branch, last one 4 days ago
26
128
apache-2.0
13
A Data Platform built for AWS, powered by Kubernetes.
This repository has been archived
Created 2020-10-08
894 commits to main branch, last one 11 months ago
This repository has no description...
Created 2024-04-05
47 commits to main branch, last one 20 days ago
Data Engineering Pilipinas is a community for data engineers, data analysts, data scientists, developers, AI / ML engineers, and users of closed and open source data tools and methods / techniques in ...
Created 2023-09-05
173 commits to main branch, last one 9 hours ago
Everything I have ever been asked in interviews, and other useful knowledge in a concise format
Created 2021-06-19
212 commits to main branch, last one 2 days ago
Recohut - Learn data engineering, data science
Created 2022-12-30
190 commits to main branch, last one 11 months ago
Index for online reading materials in order to learn Python and backend development/engineering concepts from scratch and develop a mastery sufficient for Senior/Principal Backend Engineers and Data E...
Created 2023-02-05
46 commits to main branch, last one about a year ago
Simple stream processing pipeline
Created 2020-10-03
25 commits to main branch, last one 11 days ago
End-to-end data engineering project to get insights from PyPI using Python, DuckDB, MotherDuck & Evidence
Created 2024-01-29
89 commits to main branch, last one 21 hours ago
Resources about data science, machine learning, deep learning, data engineering, and SQL.
Created 2022-10-14
119 commits to main branch, last one 3 months ago
Build, test, deploy, iterate - Dev and prod tool for data science pipelines
Created 2019-03-17
165 commits to master branch, last one 4 years ago
A guide for leading a data (engineering) team
Created 2024-05-07
6 commits to main branch, last one about a month ago
Build & Learn Data Engineering, Machine Learning over Kubernetes. No Shortcut approach.
Created 2022-05-02
55 commits to main branch, last one about a year ago
Duke MIDS: Data Engineering and DataOps Course
Created 2021-07-04
136 commits to main branch, last one about a year ago
Sample project that uses Dagster, dbt, DuckDB and Dash to visualize the Spanish car and motorcycle market
Created 2022-06-30
45 commits to master branch, last one about a year ago
Apply for a job at Olist's Data Team: https://olist.gupy.io/
Created 2018-09-21
52 commits to master branch, last one 2 years ago
Dockerizing an Apache Spark Standalone Cluster
Created 2021-07-19
35 commits to main branch, last one 2 years ago
Project for "Data pipeline design patterns" blog.
Created 2023-01-19
48 commits to main branch, last one about a year ago
Tutorial on how to set up Trino and Apache Ranger using Docker
Created 2021-09-22
11 commits to main branch, last one 4 months ago