Statistics for topic apache-spark
RepositoryStats tracks 641,701 GitHub repositories, of which 115 are tagged with the apache-spark topic. The most common primary language for repositories using this topic is Python (29). Other languages include Scala (28) and Jupyter Notebook (11).
Stargazers over time for topic apache-spark
Most starred repositories for topic apache-spark
Trending repositories for topic apache-spark
GraphFrames is a package for Apache Spark which provides DataFrame-based Graphs
A curated list of awesome Apache Spark packages and resources.
lakeFS - Data version control for your data lake | Git for data
Kubernetes operator for managing the lifecycle of Apache Spark applications on Kubernetes.
SQL data analysis & visualization projects using MySQL, PostgreSQL, SQLite, Tableau, Apache Spark and pySpark.
PySpark Tutorial for Beginners - Practical Examples in Jupyter Notebook with Spark version 3.4.1. The tutorial covers various topics like Spark Introduction, Spark Installation, Spark RDD Transformati...
An end-to-end data engineering pipeline that orchestrates data ingestion, processing, and storage using Apache Airflow, Python, Apache Kafka, Apache Zookeeper, Apache Spark, and Cassandra. All compone...
Code for "Efficient Data Processing in Spark" Course
A VS Code Extension to make it easier to manage and develop Spark jobs on EMR
RealTime StockStream is a streamlined, simulation system for processing live stock market data. It uses Apache Kafka for data input, Apache Spark for data handling, and Apache Cassandra for data stora...
⛳️ PASS: Amazon Web Services Certified (AWS Certified) Machine Learning Specialty (MLS-C01) by learning based on our Questions & Answers (Q&A) Practice Tests Exams.