Statistics for topic datasets

RepositoryStats tracks 627,864 GitHub repositories, of which 373 are tagged with the datasets topic. The most common primary language for repositories using this topic is Python (160). Other languages include Jupyter Notebook (40).

Stargazers over time for topic datasets

[Chart: x-axis 2020 to 2025, y-axis 0 to 90]

Most starred repositories for topic datasets

A topic-centric list of HQ open datasets.
Created 2014-11-20
781 commits to master branch, last one 4 months ago
2.6k
21.2k
apache-2.0
181
Label Studio is a multi-type data labeling and annotation tool with standardized output format
Created 2019-06-19
4,514 commits to develop branch, last one a day ago
2.8k
19.8k
apache-2.0
276
🤗 The largest hub of ready-to-use datasets for ML models with fast, easy-to-use and efficient data manipulation tools
Created 2020-03-26
4,005 commits to main branch, last one a day ago
1.4k
12.0k
apache-2.0
1.1k
pix2code: Generating Code from a Graphical User Interface Screenshot
Created 2017-05-24
26 commits to master branch, last one 4 years ago
2.1k
10.9k
mit
216
AKShare is an elegant and simple financial data interface library for Python, built for human beings! An open-source financial data interface library
Created 2019-10-01
9,005 commits to main branch, last one a day ago
800
10.2k
agpl-3.0
86
The standard data-centric AI package for data quality and machine learning with messy, real-world data and labels.
Created 2018-05-11
1,770 commits to master branch, last one 3 days ago

Trending repositories for topic datasets