36 results found

4.4k
26.3k
bsd-2-clause
579
Make Your Company Data Driven. Connect to any data source, easily visualize, dashboard and share your data.
Created 2013-10-28
7,827 commits to master branch, last one a day ago
916
2.1k
apache-2.0
62
Apache Kyuubi is a distributed and multi-tenant gateway to provide serverless SQL on data warehouses and lakehouses.
Created 2017-12-18
4,156 commits to master branch, last one 2 days ago
315
2.0k
mit
93
.NET for Apache® Spark™ makes Apache Spark™ easily accessible to .NET developers.
Created 2019-04-22
372 commits to main branch, last one about a year ago
238
1.6k
bsd-3-clause
56
A Scala kernel for Jupyter
Created 2015-03-10
1,582 commits to main branch, last one 13 days ago
434
1.2k
apache-2.0
41
Gluten is a middle layer responsible for offloading JVM-based SQL engines' execution to native engines.
Created 2021-12-06
4,099 commits to main branch, last one 5 hours ago
727
1.2k
apache-2.0
41
This is the GitHub repo for Learning Spark: Lightning-Fast Data Analytics [2nd Edition]
Created 2019-02-10
184 commits to master branch, last one 2 years ago
Big data platform for e-commerce user behavior analysis
Created 2018-06-21
45 commits to master branch, last one 5 years ago
138
568
apache-2.0
30
Qubole Sparklens tool for performance tuning Apache Spark
Created 2018-03-16
54 commits to master branch, last one 3 years ago
The Internals of Spark SQL
Created 2017-12-26
1,545 commits to main branch, last one 2 months ago
🐍 Quick reference guide to common patterns & functions in PySpark.
Created 2019-03-07
32 commits to master branch, last one about a year ago
93
407
bsd-3-clause
16
New Generation Open-Source Data Stack Demo
Created 2022-07-03
57 commits to main branch, last one about a year ago
Data Accelerator for Apache Spark simplifies onboarding to Streaming of Big Data. It offers a rich, easy to use experience to help with creation, editing and management of Spark jobs on Azure HDInsigh...
Created 2019-03-14
960 commits to master branch, last one about a month ago
28
285
apache-2.0
12
Use SQL to build ELT pipelines on a data lakehouse.
Created 2021-03-11
481 commits to main branch, last one 2 years ago
Apache Spark™ and Scala Workshops
Created 2016-03-10
318 commits to gh-pages branch, last one 2 years ago
19
213
apache-2.0
8
Qbeast-spark: DataSource enabling multi-dimensional indexing and efficient data sampling. Big Data, free from the unnecessary!
Created 2021-09-23
1,082 commits to main branch, last one 14 hours ago
Spark Structured Streaming / Kafka / Cassandra / Elastic
Created 2017-06-15
25 commits to master branch, last one 6 years ago
73
180
apache-2.0
17
An encrypted data analytics platform
Created 2016-10-31
675 commits to master branch, last one about a year ago
Apache Spark 3 - Structured Streaming Course Material
Created 2020-07-21
29 commits to master branch, last one 4 years ago
49
112
apache-2.0
35
Spark Connector to read and write with Pulsar
Created 2019-07-01
192 commits to master branch, last one 6 months ago
Real-time Data Warehouse with Apache Flink & Apache Kafka & Apache Hudi
Created 2021-06-27
18 commits to hudi branch, last one 2 years ago
Data cleaning, pre-processing, and Analytics on a million movies using Spark and Scala.
Created 2018-03-26
60 commits to master branch, last one 3 years ago
This repository will help you learn Databricks concepts through examples. It will include all the important topics that we need in our real-life experience as a data engineer. We wil...
Created 2022-05-10
45 commits to master branch, last one 2 years ago
Apache Spark Connect Client for Rust
Created 2023-09-18
81 commits to main branch, last one 6 days ago
Apache Spark Course Material
Created 2020-05-05
34 commits to master branch, last one 4 years ago
Tutorials on Big Data essentials: Hadoop, MapReduce, Spark.
Created 2019-08-27
350 commits to master branch, last one 15 days ago
8
61
bsd-3-clause
4
New generation open-source data stack
Created 2022-05-20
8 commits to main branch, last one 2 years ago
bring sf to spark in production
Created 2019-01-11
143 commits to master branch, last one 2 years ago
42
55
unknown
19
Apache Spark is a fast, in-memory data processing engine with elegant and expressive development APIs to allow data workers to efficiently execute streaming, machine learning or SQL workloads that re...
Created 2016-05-04
191 commits to master branch, last one 2 years ago
Simplified ETL process in Hadoop using Apache Spark. Has complete ETL pipeline for datalake. SparkSession extensions, DataFrame validation, Column extensions, SQL functions, and DataFrame transformati...
Created 2019-11-16
15 commits to master branch, last one about a year ago
尚硅谷 (Atguigu) Big Data Spark, latest 2019 edition: learning Spark
Created 2019-08-24
39 commits to master branch, last one 2 years ago