Trending repositories for topic bigdata
This is a repo with links to everything you'd ever want to learn about data engineering
Data, Analytics & AI. Modern alternative to Snowflake. Cost-effective and simple for massive-scale analytics. https://databend.com
JuiceFS is a distributed POSIX file system built on top of Redis and S3.
TDengine is an open source, high-performance, cloud native time-series database optimized for Internet of Things (IoT), Connected Cars, Industrial IoT and DevOps.
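100+ sleek HTML5 big-screen templates for big data visualization, covering industries such as community, property management, government affairs, transportation, and finance and banking. The newest, largest, most complete, and coolest collection of big data visualization templates on the web. Updated continuously.
A curated list of awesome big data frameworks, resources and other awesomeness.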
Distributed SQL transaction & query engine for data sharding, scaling, encryption, and more - on any database.
An end-to-end demo of Google's Cloud data and analytics stack.
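🔨 Generate structured SQL statements from JSON. Built with Vue3 + TypeScript + Vite + Ant Design + MonacoEditor. The project is simple (heavy on logic, light on UI) and well suited for practice.
🔥🔥🔥 📚 This repository is a technical summary of what the author, 冰河 (Binghe), learned over years of development and architecture work at major internet companies. It aims to provide a clear, detailed learning tutorial, with a focus on core Java, underlying principles, architecture knowledge, and penetration testing. If this repository helps you, please show your support (follow, like, share)!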
Datafaker is a large-scale test data and flow test data generation tool. It fakes data and inserts it into various data sources. (Test data generation tool.)
Apache Amoro (incubating) is a Lakehouse management system built on open data lake formats.
Google, Naver multiprocess image web crawler (Selenium)
First open-source data discovery and observability platform. We make life easy for data practitioners so you can focus on your business.
Apache Spark & Python (pySpark) tutorials for Big Data Analysis and Machine Learning as IPython / Jupyter notebooks
Apache Celeborn is an elastic and high-performance service for shuffle and spilled data.
TensorBase is a new big data warehouse built with modern efforts.
KDP (Kubernetes Data Platform) delivers a modern, hybrid and cloud-native data platform based on Kubernetes.
ThereForYou: Your mental health ally. Kai, our AI assistant, offers compassionate support. Track your mood trends, find solace in a secure community, and access crisis resources swiftly. We're here to...
A cross-platform ECharts dashboard application, designed around Excel data, with the capability to update data remotely. Supports line, spline, area, areaspline, column, bar, pie, scatter, angular ga...
Meteor is an easy-to-use, plugin-driven metadata collection framework to extract data from different sources and sink to any data catalog.
This project provides Kotlin bindings and several extensions for Apache Spark. We are looking to have this as a part of Apache Spark 3.x.
GUI-based Python code generator for data science, extension to Jupyter Lab, Jupyter Notebook and Google Colab.
Out-of-Core hybrid Apache Arrow/NumPy DataFrame for Python, ML, visualization and exploration of big tabular data at a billion rows per second 🚀
Possibly the fastest DataFrame-agnostic quality check library in town.
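A knowledge base for data construction and big data technology, covering mainstream frameworks such as Hadoop, Hive, Spark, and Flink and their related frameworks, as well as data middle platforms, data lakes, data governance, data warehouse construction, digital transformation, and more.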
ERD Online is online collaborative data warehouse design software. It requires no local installation and lets you work with databases online. It is an excellent alternative to desktop data mode...
CloudEon uses Kubernetes to install and deploy open-source big data components, enabling the containerized operation of an open-source big data platform. This allows you to reduce your focus on underl...
GridDB is a next-generation open source database that makes time series IoT and big data fast and easy.
Flowman is an ETL framework powered by Apache Spark. With its declarative approach, Flowman simplifies the development of complex data pipelines.
This repo contains details related to data engineering tech stacks in GCP.
A time series database for storing and managing large amounts of blob data
The vehicle orientation dataset is a large-scale dataset containing more than one million annotations for vehicle detection with simultaneous orientation classification using a standard object detecti...
A free, open-source, web-based self-service BI tool tailor-made for ClickHouse, Google BigQuery, MySQL, PostgreSQL, and Vertica.
This repository consists of assignments and projects from the iNeuron Full Stack Data Science Course.