Statistics for topic big-data
RepositoryStats tracks 635,692 GitHub repositories, of which 371 are tagged with the big-data topic. The most common primary language for repositories using this topic is Java (94). Other languages include Python (60), Scala (32), Jupyter Notebook (27), C++ (22), Rust (18), JavaScript (15), Go (14), and TypeScript (14).
Stargazers over time for topic big-data
Most starred repositories for topic big-data
Trending repositories for topic big-data
ClickHouse® is a real-time analytics database management system
The Patterns of Scalable, Reliable, and Performant Large-Scale Systems
Data science Python notebooks: Deep learning (TensorFlow, Theano, Caffe, Keras), scikit-learn, Kaggle, big data (Spark, Hadoop MapReduce, HDFS), matplotlib, pandas, NumPy, SciPy, Python essentials, AW...
Apache Spark - A unified analytics engine for large-scale data processing
Use CH-UI to work with your data from ClickHouse self-hosted with a user-friendly interface. CH-UI is a modern and feature-rich user interface for ClickHouse databases. It offers an intuitive platfor...
LakeSail's computation framework with a mission to unify batch processing, stream processing, and compute-intensive (AI) workloads.
Official repository of Trino, the distributed SQL query engine for big data, formerly known as PrestoSQL (https://trino.io)
An open source, standard data file format for graph data storage and retrieval.
CortexBrain is an ambitious open source project aimed at creating an intelligent, lightweight, and efficient service mesh architecture to seamlessly connect cloud and edge devices
High performance data processing employs high performance computing (HPC) to process data, which is then translated into information and knowledge. The advent of high-performance computing and data an...
Cloud-native search engine for observability. An open-source alternative to Datadog, Elasticsearch, Loki, and Tempo.
Bigtop Manager provides a modern, low-threshold web application to simplify the deployment and management of components for Bigtop, similar to Apache Ambari and Cloudera Manager.
This is a repository to demonstrate my details, skills, projects and to keep track of my progression in Data Analytics and Data Science topics.
Apache Paimon Rust: the Rust implementation of Apache Paimon.