Trending repositories for topic bigdata
This is a repo with links to everything you'd ever want to learn about data engineering
Data, Analytics & AI. Modern alternative to Snowflake. Cost-effective and simple for massive-scale analytics. https://databend.com
JuiceFS is a distributed POSIX file system built on top of Redis and S3.
A curated list of awesome big data frameworks, resources and other awesomeness.
High-performance, scalable time-series database designed for Industrial IoT (IIoT) scenarios
Distributed SQL transaction & query engine for data sharding, scaling, encryption, and more - on any database.
Apache Celeborn is an elastic and high-performance service for shuffle and spilled data.
IT Knowledge Base from 20 years in DevOps, Linux, Cloud, Big Data, AWS, GCP etc - gradually porting my large private knowledge base to public
CloudEon uses Kubernetes to install and deploy open-source big data components, enabling the containerized operation of an open-source big data platform. This allows you to reduce your focus on underl...
Multi-Modal Database replacing MongoDB, Neo4J, and Elastic with 1 faster ACID solution, with NetworkX and Pandas interfaces, and bindings for C99, C++17, Python 3, Java, GoLang 🗄️
First open-source data discovery and observability platform. We make life easy for data practitioners so you can focus on your business.
Google, Naver multiprocess image web crawler (Selenium)
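100+ sets of cool big data visualization large-screen HTML5 templates, covering industries such as community, property management, government affairs, transportation, and finance and banking; the newest, most numerous, most complete, coolest, and most dazzling big data visualization templates on the web. Updated continuously.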
A cross-platform, PowerPoint-like ECharts dashboard application designed around Excel data, with the capability to update data remotely. Supports line, spline, area, areaspline, column, bar, pie, sca...
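Hydra (nine-headed dragon): a step-by-step guide to building your own cross-platform, TB-to-PB-scale dedicated search engine, your own "eye of God". Hydra is an abstracted distributed operating system oriented toward cloud computing, multi-task scheduling, MapReduce, communication, data warehousing, and microservices, illustrated by implementing a small crawler-based search engine.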
Possibly the fastest DataFrame-agnostic quality check library in town.
KDP (Kubernetes Data Platform) delivers a modern, hybrid and cloud-native data platform based on Kubernetes.
A full big data pipeline (Lambda Architecture) with Spark, Kafka, HDFS and Cassandra.
This repository contains the assignments and projects from the iNeuron Full Stack Data Science course.
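A knowledge system for data construction and big data technology, covering mainstream frameworks such as Hadoop, Hive, Spark, and Flink and their related families, plus data middle platforms, data lakes, data governance, data warehouse construction, and digital transformation.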
AthenaCLI is a CLI tool for the AWS Athena service that supports auto-completion and syntax highlighting.
Apache Amoro (incubating) is a Lakehouse management system built on open data lake formats.
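✏️ [Computer fundamentals + Java fundamentals + big data fundamentals and advanced topics + interview guide] A project covering most of the core knowledge of computer fundamentals, Java, big data, and interview preparation. Study, interview, and improve together!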
This repo contains details on data engineering tech stacks in GCP.
ThereForYou: Your mental health ally. Kai, our AI assistant, offers compassionate support. Track your mood trends, find solace in a secure community, and access crisis resources swiftly. We're here to...
Out-of-Core hybrid Apache Arrow/NumPy DataFrame for Python, ML, visualization and exploration of big tabular data at a billion rows per second 🚀
GridDB is a next-generation open source database that makes time series IoT and big data fast and easy.
An end-to-end demo of Google's Cloud data and analytics stack.
The vehicle orientation dataset is a large-scale dataset containing more than one million annotations for vehicle detection with simultaneous orientation classification using a standard object detecti...
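📚 This static blog is a technical summary of the learning journey of its author, Binghe (冰河), over many years of development and architecture work at major internet companies. It aims to provide clear, detailed tutorials, with an emphasis on core Java, underlying principles, architecture, and penetration-testing techniques. If this repository helps you, please show your support (follow, like, share)!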