36 results found

4.4k
26.7k
bsd-2-clause
578
Make Your Company Data Driven. Connect to any data source, easily visualize, dashboard and share your data.
Created 2013-10-28
7,838 commits to master branch, last one 2 days ago
921
2.1k
apache-2.0
63
Apache Kyuubi is a distributed and multi-tenant gateway to provide serverless SQL on data warehouses and lakehouses.
Created 2017-12-18
4,203 commits to master branch, last one 3 days ago
316
2.0k
mit
89
.NET for Apache® Spark™ makes Apache Spark™ easily accessible to .NET developers.
Created 2019-04-22
377 commits to main branch, last one 10 days ago
241
1.6k
bsd-3-clause
56
A Scala kernel for Jupyter
Created 2015-03-10
1,583 commits to main branch, last one about a month ago
446
1.2k
apache-2.0
41
Gluten is a middle layer responsible for offloading JVM-based SQL engines' execution to native engines.
Created 2021-12-06
4,374 commits to main branch, last one 2 days ago
741
1.2k
apache-2.0
41
This is the GitHub repo for Learning Spark: Lightning-Fast Data Analytics [2nd Edition]
Created 2019-02-10
184 commits to master branch, last one 3 years ago
Big data platform for e-commerce user behavior analysis
Created 2018-06-21
45 commits to master branch, last one 5 years ago
138
567
apache-2.0
30
Qubole Sparklens tool for performance tuning Apache Spark
Created 2018-03-16
54 commits to master branch, last one 3 years ago
🐍 Quick reference guide to common patterns & functions in PySpark.
Created 2019-03-07
32 commits to master branch, last one about a year ago
The Internals of Spark SQL
Created 2017-12-26
1,552 commits to main branch, last one 29 days ago
95
417
bsd-3-clause
16
New Generation Open-Source Data Stack Demo
Created 2022-07-03
57 commits to main branch, last one about a year ago
Data Accelerator for Apache Spark simplifies onboarding to Streaming of Big Data. It offers a rich, easy to use experience to help with creation, editing and management of Spark jobs on Azure HDInsigh...
Created 2019-03-14
960 commits to master branch, last one 2 months ago
28
285
apache-2.0
12
Use SQL to build ELT pipelines on a data lakehouse.
Created 2021-03-11
481 commits to main branch, last one 2 years ago
Apache Spark™ and Scala Workshops
Created 2016-03-10
318 commits to gh-pages branch, last one 2 years ago
20
221
apache-2.0
12
Qbeast-spark: DataSource enabling multi-dimensional indexing and efficient data sampling. Big Data, free from the unnecessary!
Created 2021-09-23
1,103 commits to main branch, last one 13 days ago
Spark Structured Streaming / Kafka / Cassandra / Elastic
Created 2017-06-15
25 commits to master branch, last one 6 years ago
74
180
apache-2.0
18
An encrypted data analytics platform
Created 2016-10-31
675 commits to master branch, last one about a year ago
Apache Spark 3 - Structured Streaming Course Material
Created 2020-07-21
29 commits to master branch, last one 4 years ago
50
114
apache-2.0
35
Spark Connector to read and write with Pulsar
Created 2019-07-01
192 commits to master branch, last one 8 months ago
Real-time Data Warehouse with Apache Flink & Apache Kafka & Apache Hudi
Created 2021-06-27
18 commits to hudi branch, last one 2 years ago
This repository will help you learn Databricks concepts through examples. It will include all the important topics we need in real-life work as data engineers. We wil...
Created 2022-05-10
45 commits to master branch, last one 2 years ago
Data cleaning, pre-processing, and analytics on a million movies using Spark and Scala.
Created 2018-03-26
60 commits to master branch, last one 3 years ago
Apache Spark Connect Client for Rust
Created 2023-09-18
83 commits to main branch, last one 13 days ago
Apache Spark Course Material
Created 2020-05-05
34 commits to master branch, last one 4 years ago
Tutorials on Big Data essentials: Hadoop, MapReduce, Spark.
Created 2019-08-27
366 commits to master branch, last one 27 days ago
8
62
bsd-3-clause
4
New generation open-source data stack
Created 2022-05-20
8 commits to main branch, last one 2 years ago
bring sf to spark in production
Created 2019-01-11
143 commits to master branch, last one 3 years ago
42
55
unknown
19
Apache Spark is a fast, in-memory data processing engine with elegant and expressive development APIs to allow data workers to efficiently execute streaming, machine learning or SQL workloads that re...
Created 2016-05-04
191 commits to master branch, last one 2 years ago
Simplified ETL process in Hadoop using Apache Spark. Includes a complete ETL pipeline for a data lake. SparkSession extensions, DataFrame validation, Column extensions, SQL functions, and DataFrame transformati...
Created 2019-11-16
15 commits to master branch, last one about a year ago
尚硅谷 (Atguigu) Big Data Spark, latest 2019 edition: Spark study materials
Created 2019-08-24
39 commits to master branch, last one 2 years ago