Statistics for topic big-data-analytics
RepositoryStats tracks 518,325 GitHub repositories; of these, 15 are tagged with the big-data-analytics topic.
Stargazers over time for topic big-data-analytics
Most starred repositories for topic big-data-analytics
Trending repositories for topic big-data-analytics
1 Line of code data quality profiling & exploratory data analysis for Pandas and Spark DataFrames.
PySpark-Tutorial provides basic algorithms using PySpark
Easy Machine Learning is a general-purpose dataflow-based system for easing the process of applying machine learning algorithms to real world tasks.
A data-driven method combining symbolic regression and compressed sensing for accurate & interpretable models.
vineyard (v6d): an in-memory immutable data manager. (Project under CNCF, TAG-Storage)
Data cleaning, pre-processing, and Analytics on a million movies using Spark and Scala.
The Archives Unleashed Toolkit is an open-source toolkit for analyzing web archives.
Course covers big data fundamentals, processes, technologies, platform ecosystem, and management for practical application development.
A multi-cloud framework for big data analytics and embarrassingly parallel jobs that provides a universal API for building parallel applications in the cloud ☁️🚀