5 results found
- Filter by Primary Language:
- Java (3)
- Jupyter Notebook (1)
- Python (1)
MapReduce, Spark, Java, and Scala for Data Algorithms Book
Created
2014-08-06
787 commits to master branch, last one 5 months ago
Cloud Shuffle Service (CSS) is a general-purpose remote shuffle solution for compute engines, including Spark, Flink, and MapReduce.
Created
2022-08-17
1 commit to main branch, last one 2 years ago
Use MapReduce's Java interface to crawl data on Chinese universities in a distributed fashion and learn the basics of HDFS.
Created
2023-04-10
77 commits to master branch, last one 5 months ago
Tutorials on Big Data essentials: Hadoop, MapReduce, Spark. Explore a variety of tutorials and demonstrations on Big Data technologies, primarily in the form of Jupyter notebooks. Most notebooks are s...
Created
2019-08-27
370 commits to master branch, last one 2 months ago
Simplified ETL process in Hadoop using Apache Spark. Has a complete ETL pipeline for a data lake. SparkSession extensions, DataFrame validation, Column extensions, SQL functions, and DataFrame transformati...
Created
2019-11-16
15 commits to master branch, last one about a year ago