Trending repositories for topic data-analytics
Apache Superset is a Data Visualization and Data Exploration Platform
AI-data warehouse to enrich, transform and analyze unstructured data
Modern columnar data format for ML and LLMs implemented in Rust. Convert from parquet in 2 lines of code for 100x faster random access, vector index, and data versioning. Compatible with Pandas, DuckD...
AI-Driven Research Assistant: An advanced multi-agent system for automating complex research processes. Leveraging LangChain, OpenAI GPT, and LangGraph, this tool streamlines hypothesis generation, da...
Lean and mean distributed stream processing system written in Rust and WebAssembly. Alternative to Kafka + Flink in one.
A curated list of awesome big data frameworks, resources and other awesomeness.
ARX is a comprehensive open source data anonymization tool aiming to provide scalability and usability. It supports various anonymization techniques, methods for analyzing data quality and re-identifi...
Danfo.js is an open-source JavaScript library providing high-performance, intuitive, and easy-to-use data structures for manipulating and processing structured data.
A curated list of awesome blogs, videos, tools and resources about Data Contracts
Zui is a powerful desktop application for exploring and working with data. The official front-end to the Zed lake.
Python ETL framework for stream processing, real-time analytics, LLM pipelines, and RAG.
Datart is a next generation Data Visualization Open Platform
MinusX is an AI Data Scientist for Analytics Apps you already use and love. Currently it supports Jupyter, Metabase, Google Sheets & Posthog.
A curated list of open source tools used in analytics platforms and the data engineering ecosystem
Portfolio of data science and data analyst projects completed by me for academic, self-learning, and hobby purposes.
🤖 The semantic engine for LLMs, bringing semantic context to AI agents. 🔥
Graph-indexed Pandas DataFrames for analyzing hierarchical performance data
Discover a curated collection of dynamic Power BI dashboards covering financial analytics, HR metrics, streaming service trends, real estate dynamics, and more. Meticulously designed for comprehensive...
Hands-on lab for Neo4j and Amazon Bedrock
A roadmap to guide you through mastering SQL for Data Science in just 6 weeks for free
Data Analysis Using Python: A Beginner’s Guide Featuring NYC Open Data.
This project aims to predict the magnitude and probability of an earthquake occurring in a particular region, using historical data with various machine learning models to find which model is more accur...
Using a combination of Excel, SQL, and Tableau, I delved into the extensive datasets comprising over 82k rows of data from Netflix's shows and movies library. Through data simplification and analysis,...
Collection of SQL-Tableau integration projects for Data Analytics and Business Intelligence
Kickstart AI through Machine Learning and Deep Learning Projects (20+)
Scraping Wikipedia by combining LangChain's agents and tools with OpenAI's LLMs and function calling
Desbordante is a high-performance data profiler capable of discovering many different patterns in data using various algorithms. It also allows running data cleaning scenarios using these algor...
I am sharing my 66DaysofData journey into Data Analytics by participating in Ken Jee's #66daysofdata challenge
Content and practical material for each class.
Deploys a Lakehouse Architecture Solution
Real-time explainable machine learning for business optimisation
A simple package to abstract away the process of creating usable DataFrames for data analytics. This package is heavily inspired by the amazing Python library, Pandas.
A library for learners! Whether or not you're a part of AWS Cloud Clubs, take a look in this library for free, open, leveled content for students 18+ worldwide