Statistics for topic big-data
RepositoryStats tracks 518,986 GitHub repositories, of which 342 are tagged with the big-data topic. The most common primary language for repositories using this topic is Java (91). Other languages include Python (55), Scala (29), Jupyter Notebook (24), C++ (21), JavaScript (16), Go (13), Rust (11), and TypeScript (11).
Stargazers over time for topic big-data
Most starred repositories for topic big-data
Trending repositories for topic big-data
StarRocks, a Linux Foundation project, is a next-generation sub-second MPP OLAP database for full analytics scenarios, including multi-dimensional analytics, real-time analytics, and ad-hoc queries. ...
The Patterns of Scalable, Reliable, and Performant Large-Scale Systems
The Lakehouse Engine is a configuration driven Spark framework, written in Python, serving as a scalable and distributed engine for several lakehouse algorithms, data flows and utilities for Data Prod...
An end-to-end data engineering pipeline that orchestrates data ingestion, processing, and storage using Apache Airflow, Python, Apache Kafka, Apache Zookeeper, Apache Spark, and Cassandra. All compone...
Distributed DataFrame for Python designed for the cloud, powered by Rust
An open-source, high-performance SQL vector database built on ClickHouse.
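XL-LightHouse is a general-purpose streaming big-data statistics system that supports very large data volumes and very high concurrency. Common use cases include PV and UV statistics; e-commerce sales and ordering-user counts; log volume statistics; statistics on API call volume, error counts, and latency; and monitoring of server operations metrics. The system supports multi-dimensional statistics and a wide range of complex filter conditions and logical checks, offers one-click deployment and one-line-of-code integration, and makes real-time statistics over massive data easy, helping enterprises quickly build a data metrics system at lower cost while reducing costs and improving efficiency.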
TuGraph Analytics is the fastest OLAP graph database.
Yet another repository with basic concepts, technical challenges, and resources on data engineering in Spanish 🧙✨
A free, simple, and easy-to-use technology-style UI component, developed based on React