5 results found

MapReduce, Spark, Java, and Scala for Data Algorithms Book
Created 2014-08-06
787 commits to master branch, last one about a month ago
Cloud Shuffle Service (CSS) is a general-purpose remote shuffle solution for compute engines, including Spark, Flink, and MapReduce.
Created 2022-08-17
1 commit to main branch, last one 2 years ago
Distributed crawling of data from all Chinese universities, using Hadoop and MapReduce.
Created 2023-04-10
77 commits to master branch, last one about a month ago
Tutorials on Big Data essentials: Hadoop, MapReduce, Spark.
Created 2019-08-27
351 commits to master branch, last one 13 days ago
Simplified ETL process in Hadoop using Apache Spark. Includes a complete ETL pipeline for a data lake. SparkSession extensions, DataFrame validation, Column extensions, SQL functions, and DataFrame transformati...
Created 2019-11-16
15 commits to master branch, last one about a year ago