Trending repositories for topic data-analytics
Apache Superset is a Data Visualization and Data Exploration Platform
DATAGEN: AI-driven multi-agent research assistant automating hypothesis generation, data analysis, and report writing. Now expanding into crypto market intelligence. Learn more: https://datagen.digita...
Modern columnar data format for ML and LLMs implemented in Rust. Convert from parquet in 2 lines of code for 100x faster random access, vector index, and data versioning. Compatible with Pandas, DuckD...
A curated list of awesome big data frameworks, resources and other awesomeness.
Lean and mean distributed stream processing system written in Rust and WebAssembly. Alternative to Kafka + Flink in one.
A curated list of open source tools used in analytics platforms and the data engineering ecosystem
A comprehensive guide to building a modern data warehouse with SQL Server, including ETL processes, data modeling, and analytics.
Main repo including core data model, data marts, reference data, terminology, and the clinical concept library
This real-time project integrates flight information from the AviationStack API for DFW Airport and weather data from the National Weather Service API, to provide the latest arrival, departure, and fo...
I am sharing my Journey of 66DaysofData into Data Analytics by participating in Ken Jee's #66daysofdata challenge
Danfo.js is an open source, JavaScript library providing high performance, intuitive, and easy to use data structures for manipulating and processing structured data.
Datart is a next generation Data Visualization Open Platform
MinusX is an AI Data Scientist for Analytics Apps you already use and love. Currently it supports Jupyter, Metabase, Google Sheets & Posthog.
Python ETL framework for stream processing, real-time analytics, LLM pipelines, and RAG.
This repository contains a collection of SQL scripts demonstrating various analytical techniques, such as change-over-time, cumulative, performance, data segmentation, and part-to-whole analysis.
Discover a curated collection of dynamic Power BI dashboards covering financial analytics, HR metrics, streaming service trends, real estate dynamics, and more. Meticulously designed for comprehensive...
ANJANA is a Python library for anonymizing sensitive data
A configuration-driven framework for building Dagster pipelines that enables teams to create and manage data workflows using YAML/JSON instead of code
Deploys a Lakehouse Architecture Solution
A roadmap to guide you through mastering SQL for Data Science in just 6 weeks for free
dpq is an open-source Python library that makes prompt-based data transformations and feature engineering easy
Desbordante is a high-performance data profiler capable of discovering many different patterns in data using various algorithms. It also allows running data cleaning scenarios using these algor...
This project aims to predict the magnitude and probability of an earthquake occurring in a particular region, using historical data with various machine learning models to find which model is more accur...
Hands-on lab for Neo4j and Amazon Bedrock
This repository contains a SQL dataset of a music store and SQL queries to answer questions about the data. The results of the SQL queries can be found in the analysis.sql file. This repository can b...
🦖 A SQL-on-everything Query Engine you can execute over multiple databases and file formats. Query your data, where it lives.
Using a combination of Excel, SQL, and Tableau, I delved into the extensive datasets comprising over 82k rows of data from Netflix's shows and movies library. Through data simplification and analysis,...