Statistics for topic data-science
RepositoryStats tracks 642,787 GitHub repositories, of which 2,247 are tagged with the data-science topic. The most common primary language for repositories using this topic is Python (774). Other languages include Jupyter Notebook (642), R (76), HTML (71), TypeScript (56), JavaScript (54), Go (40), C++ (37), Rust (24), and Java (23).
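To put the counts above in proportion, here is a minimal sketch that converts them into percentages. The variable and function names are illustrative, not part of RepositoryStats, and the "unlisted" figure simply assumes that every tagged repository not covered by the ten listed languages has some other primary language or none.

```python
# Counts copied from the summary above; all names here are illustrative.
TOTAL_TRACKED = 642_787   # repositories tracked by RepositoryStats
TAGGED = 2_247            # repositories tagged with data-science

LANGUAGE_COUNTS = {
    "Python": 774,
    "Jupyter Notebook": 642,
    "R": 76,
    "HTML": 71,
    "TypeScript": 56,
    "JavaScript": 54,
    "Go": 40,
    "C++": 37,
    "Rust": 24,
    "Java": 23,
}


def share(part: int, whole: int) -> float:
    """Return part as a percentage of whole."""
    return 100.0 * part / whole


# Share of all tracked repositories that carry the topic.
topic_share = share(TAGGED, TOTAL_TRACKED)

# Share of tagged repositories per listed primary language.
language_shares = {lang: share(n, TAGGED) for lang, n in LANGUAGE_COUNTS.items()}

# Assumption: tagged repositories outside the ten listed languages
# have another primary language or none at all.
listed_total = sum(LANGUAGE_COUNTS.values())
unlisted = TAGGED - listed_total

if __name__ == "__main__":
    print(f"data-science topic: {topic_share:.2f}% of tracked repositories")
    for lang, pct in language_shares.items():
        print(f"{lang:<17} {pct:5.1f}%")
    print(f"{'Other / none':<17} {share(unlisted, TAGGED):5.1f}%")
```

Under these numbers, Python and Jupyter Notebook together account for well over half of the tagged repositories.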
Stargazers over time for topic data-science
Most starred repositories for topic data-science
Trending repositories for topic data-science
A reactive notebook for Python — run reproducible experiments, query with SQL, execute as a script, deploy as an app, and version with git. All in a modern, AI-native editor.
Chat with your database or your datalake (SQL, CSV, parquet). PandasAI makes data analysis conversational using LLMs and RAG.
Apache Airflow - A platform to programmatically author, schedule, and monitor workflows
The most comprehensive SQL guide from a real-world expert! Learn everything from basics to advanced queries, optimizations, and real-world SQL
This repository contains a collection of SQL scripts demonstrating various analytical techniques, such as change-over-time, cumulative, performance, data segmentation, and part-to-whole analysis.
Data science, machine learning books and resources
ML-algorithms from scratch using Python. Classic Machine Learning course.
Visual AI development framework for training and inference of ML models, scaling pipelines, and automating workflows with Python. ⭐ Leave a star to support us!
A reactive notebook for Python — run reproducible experiments, query with SQL, execute as a script, deploy as an app, and version with git. All in a modern, AI-native editor.
Chat with your database or your datalake (SQL, CSV, parquet). PandasAI makes data analysis conversational using LLMs and RAG.
The most comprehensive SQL guide from a real-world expert! Learn everything from basics to advanced queries, optimizations, and real-world SQL
This repo is meant to serve as a detailed guide for Machine Learning/AI interviews.
A comprehensive guide to building a modern data warehouse with SQL Server, including ETL processes, data modeling, and analytics.
Data science, machine learning books and resources
If you're coming from one of my data science tutorials, you'll find the code and the links to the tutorials here. I hope you find them helpful. Happy learning and coding!
This repo is meant to serve as a detailed guide for Machine Learning/AI interviews.
Chat with your database or your datalake (SQL, CSV, parquet). PandasAI makes data analysis conversational using LLMs and RAG.
A reactive notebook for Python — run reproducible experiments, query with SQL, execute as a script, deploy as an app, and version with git. All in a modern, AI-native editor.
Apache Superset is a Data Visualization and Data Exploration Platform
Build and share delightful machine learning apps, all in Python. 🌟 Star to support our work!
The most comprehensive SQL guide from a real-world expert! Learn everything from basics to advanced queries, optimizations, and real-world SQL
This class is a broad overview and dive into Exploiting AI and the different attacks that exist, and best practice strategies.
Fast CPU and GPU Python implementations of Improved Kernel PLS by Dayal and MacGregor (1997) and Shortcutting Cross-Validation by Engstrøm (2024).
A curated list of 100+ resources for building and deploying generative AI specifically focusing on helping you become a Generative AI Data Scientist with LLMs
Best Data Science, Data Analytics, AI, and SDE roadmaps. This repository is continually updated based on the top job postings on LinkedIn and Indeed in the data science and AI domain.
An AI-powered data science team of agents to help you perform common data science tasks 10X faster.
DATAGEN: AI-driven multi-agent research assistant automating hypothesis generation, data analysis, and report writing. Now expanding into crypto market intelligence. Learn more: https://datagen.digita...
Curated Data Science resources (Free & Paid) to help aspiring and experienced data scientists learn, grow, and advance their careers.
Build and share delightful machine learning apps, all in Python. 🌟 Star to support our work!
A reactive notebook for Python — run reproducible experiments, query with SQL, execute as a script, deploy as an app, and version with git. All in a modern, AI-native editor.
Chat with your database or your datalake (SQL, CSV, parquet). PandasAI makes data analysis conversational using LLMs and RAG.
Streamlit — A faster way to build and share data apps.
Apache Superset is a Data Visualization and Data Exploration Platform
2025 AI/ML internship & new graduate job list updated daily
Chat with your data - AI data analysis and visualization on CSV, Postgres, MySQL, Snowflake, SQLite...
Visual Data Transformation and Data Preparation. Low-Code Python-based ETL.
Chat with your data, modify it, visualize it, create and test machine learning models all in plain English. DataHorse makes data analysis and data science conversational using LLMs.
Visualise your CSV files in seconds without sending your data anywhere