16 results found

19
467
postgresql
4
DuckDB-powered data lake analytics from Postgres
Created 2024-05-09
103 commits to dev branch, last one 3 days ago
Generate relevant synthetic data quickly for your projects. The Databricks Labs synthetic data generator (aka `dbldatagen`) may be used to generate large simulated / synthetic data sets for test, POC...
Created 2019-07-23
300 commits to master branch, last one about a month ago
85
386
apache-2.0
26
A highly efficient daemon for streaming data from Kafka into Delta Lake
Created 2021-02-05
267 commits to main branch, last one 11 days ago
44
315
mit
15
Delta Lake helper methods in PySpark
Created 2022-11-26
74 commits to main branch, last one 5 months ago
The Internals of Delta Lake
Created 2019-10-30
677 commits to main branch, last one 22 days ago
Smart Automation Tool for building modern Data Lakes and Data Pipelines
Created 2019-08-07
2,029 commits to develop-spark3 branch, last one 11 days ago
1
115
gpl-3.0
1
A lightweight, comprehensive solution for managing Delta tables, built on polars and deltalake
Created 2024-08-10
5 commits to master branch, last one about a month ago
Real-time Data Warehouse with Apache Flink & Apache Kafka & Apache Hudi
Created 2021-06-27
18 commits to hudi branch, last one 2 years ago
44
103
apache-2.0
20
Streaming application development and management system based on Linkis and DSS, with planned support for workflow-style graphical drag-and-drop development.
Created 2021-03-25
932 commits to main branch, last one about a month ago
This repository helps you learn Databricks concepts through examples. It covers the important topics a data engineer needs in real-world work. We wil...
Created 2022-05-10
45 commits to master branch, last one 2 years ago
This repository demonstrates a simple ELT process that uses Delta to perform upserts and to remove data files that are no longer part of the latest state of the table's transaction log.
Created 2021-05-27
23 commits to main branch, last one 2 years ago
Command-line interface to quickly generate fake CSV and JSON data
Created 2023-05-25
39 commits to main branch, last one 6 months ago
27
50
unknown
8
Databricks Platform - Architecture, Security, Automation and much more!!
Created 2019-08-17
326 commits to master branch, last one 26 days ago
This repository has no description...
Created 2024-07-02
9 commits to main branch, last one 7 months ago
9
27
agpl-3.0
2
Collection of AWS Lambdas for creating and managing Delta tables
Created 2023-05-06
182 commits to main branch, last one 2 days ago