36 results found

4.4k
27.2k
bsd-2-clause
574
Make Your Company Data Driven. Connect to any data source, easily visualize, dashboard and share your data.
Created 2013-10-28
7,863 commits to master branch, last one a day ago
936
2.2k
apache-2.0
63
Apache Kyuubi is a distributed and multi-tenant gateway to provide serverless SQL on data warehouses and lakehouses.
Created 2017-12-18
4,267 commits to master branch, last one a day ago
322
2.0k
mit
84
.NET for Apache® Spark™ makes Apache Spark™ easily accessible to .NET developers.
Created 2019-04-22
383 commits to main branch, last one 6 days ago
248
1.6k
bsd-3-clause
56
A Scala kernel for Jupyter
Created 2015-03-10
1,631 commits to main branch, last one 29 days ago
474
1.3k
apache-2.0
40
Gluten is a middle layer responsible for offloading JVM-based SQL engines' execution to native engines.
Created 2021-12-06
4,763 commits to main branch, last one 17 hours ago
762
1.3k
apache-2.0
40
This is the GitHub repo for Learning Spark: Lightning-Fast Data Analytics [2nd Edition]
Created 2019-02-10
185 commits to master branch, last one 2 months ago
Big data platform for e-commerce user behavior analysis
Created 2018-06-21
45 commits to master branch, last one 5 years ago
140
573
apache-2.0
28
Qubole Sparklens tool for performance tuning Apache Spark
Created 2018-03-16
54 commits to master branch, last one 3 years ago
🐍 Quick reference guide to common patterns & functions in PySpark.
Created 2019-03-07
32 commits to master branch, last one 2 years ago
The Internals of Spark SQL
Created 2017-12-26
1,554 commits to main branch, last one 2 months ago
100
428
bsd-3-clause
16
New Generation Opensource Data Stack Demo
Created 2022-07-03
57 commits to main branch, last one 2 years ago
Data Accelerator for Apache Spark simplifies onboarding to Streaming of Big Data. It offers a rich, easy to use experience to help with creation, editing and management of Spark jobs on Azure HDInsigh...
Created 2019-03-14
961 commits to master branch, last one 2 months ago
28
285
apache-2.0
11
Use SQL to build ELT pipelines on a data lakehouse.
Created 2021-03-11
481 commits to main branch, last one 2 years ago
Apache Spark™ and Scala Workshops
Created 2016-03-10
318 commits to gh-pages branch, last one 2 years ago
21
225
apache-2.0
10
Qbeast-spark: DataSource enabling multi-dimensional indexing and efficient data sampling. Big Data, free from the unnecessary!
Created 2021-09-23
1,106 commits to main branch, last one 2 months ago
Spark Structured Streaming / Kafka / Cassandra / Elastic
Created 2017-06-15
25 commits to master branch, last one 6 years ago
73
182
apache-2.0
17
An encrypted data analytics platform
Created 2016-10-31
675 commits to master branch, last one 2 years ago
Apache Spark 3 - Structured Streaming Course Material
Created 2020-07-21
29 commits to master branch, last one 4 years ago
Real-time Data Warehouse with Apache Flink & Apache Kafka & Apache Hudi
Created 2021-06-27
18 commits to hudi branch, last one 3 years ago
50
113
apache-2.0
32
Spark Connector to read and write with Pulsar
Created 2019-07-01
192 commits to master branch, last one 11 months ago
17
106
apache-2.0
4
Apache Spark Connect Client for Rust
Created 2023-09-18
86 commits to main branch, last one about a month ago
This repository will help you learn Databricks concepts through examples. It will include all the important topics that we need in real-life work as a data engineer. We wil...
Created 2022-05-10
45 commits to master branch, last one 2 years ago
Data cleaning, pre-processing, and Analytics on a million movies using Spark and Scala.
Created 2018-03-26
60 commits to master branch, last one 3 years ago
Apache Spark Course Material
Created 2020-05-05
34 commits to master branch, last one 4 years ago
Tutorials on Big Data essentials: Hadoop, MapReduce, Spark. Explore a variety of tutorials and demonstrations on Big Data technologies, primarily in the form of Jupyter notebooks. Most notebooks are s...
Created 2019-08-27
370 commits to master branch, last one 3 months ago
9
65
bsd-3-clause
4
New generation opensource data stack
Created 2022-05-20
8 commits to main branch, last one 2 years ago
bring sf to spark in production
Created 2019-01-11
143 commits to master branch, last one 3 years ago
42
55
unknown
19
Apache Spark is a fast, in-memory data processing engine with elegant and expressive development APIs to allow data workers to efficiently execute streaming, machine learning or SQL workloads that re...
Created 2016-05-04
191 commits to master branch, last one 3 years ago
Simplified ETL process in Hadoop using Apache Spark. Has complete ETL pipeline for datalake. SparkSession extensions, DataFrame validation, Column extensions, SQL functions, and DataFrame transformati...
Created 2019-11-16
15 commits to master branch, last one about a year ago
Atguigu (尚硅谷) Big Data Spark, latest 2019 edition: Spark learning
Created 2019-08-24
39 commits to master branch, last one 2 years ago