16 results found
- Filter by Primary Language:
- Python (7)
- Rust (3)
- Java (1)
- Jupyter Notebook (1)
- Dockerfile (1)
- Scala (1)
- TSQL (1)
DuckDB-powered data lake analytics from Postgres
Created
2024-05-09
103 commits to dev branch, last one 3 days ago
Generate relevant synthetic data quickly for your projects. The Databricks Labs synthetic data generator (aka `dbldatagen`) may be used to generate large simulated / synthetic data sets for test, POC...
Created
2019-07-23
300 commits to master branch, last one about a month ago
A highly efficient daemon for streaming data from Kafka into Delta Lake
Created
2021-02-05
267 commits to main branch, last one 11 days ago
Delta Lake helper methods in PySpark
Created
2022-11-26
74 commits to main branch, last one 5 months ago
The Internals of Delta Lake
Created
2019-10-30
677 commits to main branch, last one 22 days ago
Smart Automation Tool for building modern Data Lakes and Data Pipelines
Created
2019-08-07
2,029 commits to develop-spark3 branch, last one 11 days ago
A lightweight, comprehensive solution for managing Delta tables, built on Polars and deltalake
Created
2024-08-10
5 commits to master branch, last one about a month ago
Real-time Data Warehouse with Apache Flink & Apache Kafka & Apache Hudi
Created
2021-06-27
18 commits to hudi branch, last one 2 years ago
A streaming application development and management system based on Linkis and DSS, with workflow-style graphical drag-and-drop development planned.
Created
2021-03-25
932 commits to main branch, last one about a month ago
This repository will help you learn Databricks concepts through examples. It will cover the important topics a data engineer needs in real-world work. We wil...
Created
2022-05-10
45 commits to master branch, last one 2 years ago
This repository demonstrates a simple ELT process that uses Delta to perform upserts and to remove data files that are no longer in the latest state of the table's transaction log.
Created
2021-05-27
23 commits to main branch, last one 2 years ago
Command-line interface to quickly generate fake CSV and JSON data
Created
2023-05-25
39 commits to main branch, last one 6 months ago
Databricks Platform - Architecture, Security, Automation and much more!!
Created
2019-08-17
326 commits to master branch, last one 26 days ago
Threat Detection and Visualization
Created
2023-07-28
57 commits to main branch, last one about a year ago
This repository has no description...
Created
2024-07-02
9 commits to main branch, last one 7 months ago