Statistics for topic big-data
RepositoryStats tracks 579,129 GitHub repositories, of which 359 are tagged with the big-data topic. The most common primary language for repositories using this topic is Java (93). Other languages include Python (56), Scala (30), Jupyter Notebook (28), C++ (21), Rust (16), JavaScript (15), TypeScript (14), and Go (13).
Stargazers over time for topic big-data
Most starred repositories for topic big-data
Trending repositories for topic big-data
Apache Spark - A unified analytics engine for large-scale data processing
The Patterns of Scalable, Reliable, and Performant Large-Scale Systems
Distributed data engine for Python/SQL designed for the cloud, powered by Rust
Apache Wayang (incubating) is the first cross-platform data processing system.
An open source, standard data file format for graph data storage and retrieval.
Cloud-native search engine for observability. An open-source alternative to Datadog, Elasticsearch, Loki, and Tempo.
Use CH-UI to work with your data from a self-hosted ClickHouse instance through a user-friendly interface. CH-UI is a modern and feature-rich user interface for ClickHouse databases. It offers an intuitive platfor...
This is a repository to demonstrate my details, skills, projects and to keep track of my progression in Data Analytics and Data Science topics.
A curated list of awesome Online Analytical Processing databases, frameworks, resources and other awesomeness.
A @ClickHouse fork that supports high-performance vector search and full-text search.
Yet another repository with basic concepts, technical challenges, and resources on data engineering, in Spanish 🧙✨
The world's fastest open query engine for sub-second analytics both on and off the data lakehouse. With the flexibility to support nearly any scenario, StarRocks provides best-in-class performance for...
An end-to-end data engineering pipeline that orchestrates data ingestion, processing, and storage using Apache Airflow, Python, Apache Kafka, Apache Zookeeper, Apache Spark, and Cassandra. All compone...