Trending repositories for the topic "dataset"

Last 3 days (new repositories)

No newly created repositories trending in the last 3 days

Last 3 days (absolute gain)

public-apis/public-apis

A collective list of free APIs

320,395 (+252)

mit

HumanSignal/label-studio

Label Studio is a multi-type data labeling and annotation tool with standardized output format

19,867 (+59)

apache-2.0

modelscope/data-juicer

Making data higher-quality, juicier, and more digestible for foundation models! 🍎 🍋 🌽 ➡️ ➡️🍸 🍹 🍷

3,164 (+34)

apache-2.0

lukas-blecher/LaTeX-OCR

pix2tex: Using a ViT to convert images of equations into LaTeX code.

13,116 (+22)

mit

cvat-ai/cvat

Annotate better with CVAT, the industry-leading data engine for machine learning. Used and trusted by teams at any scale, for data of any scale.

12,844 (+17)

mit

RingBDStack/SocialED

A Python library for social event detection

146 (+14)

bsd-2-clause

lonePatient/awesome-pretrained-chinese-nlp-models

Awesome Pretrained Chinese NLP Models: a collection of high-quality Chinese pretrained models, large models, multimodal models, and large language models

4,996 (+11)

mit

rom1504/img2dataset

Easily turn large sets of image URLs into an image dataset. Can download, resize, and package 100M URLs in 20 hours on one machine.

3,806 (+11)

mit

hyunwoongko/transformer

Transformer: PyTorch Implementation of "Attention Is All You Need"

3,175 (+9)

satellite-image-deep-learning/techniques

Techniques for deep learning with satellite & aerial imagery

8,823 (+9)

apache-2.0

joke2k/faker

Faker is a Python package that generates fake data for you.

17,864 (+9)

mit

ashvardanian/StringZilla

Up to 10x faster strings for C, C++, Python, Rust, and Swift, leveraging NEON, AVX2, AVX-512, and SWAR to accelerate search, sort, edit distances, alignment scores, etc 🦖

2,303 (+8)

apache-2.0

microsoft/monitors4codegen

Code and data artifact for the NeurIPS 2023 paper "Monitor-Guided Decoding of Code LMs with Static Analysis of Repository Context". `multispy` is an LSP client library in Python intended to be used to bu...

228 (+7)

mit

linhandev/dataset

An Index for Medical Imaging Datasets

2,773 (+7)

meodai/color-names

Large list of handpicked color names 🌈

2,452 (+6)

mit

Belval/TextRecognitionDataGenerator

A synthetic data generator for text recognition

3,338 (+6)

mit

WayneMao/RoboMatrix

The Official Implementation of RoboMatrix

60 (+5)

mit

M-3LAB/awesome-industrial-anomaly-detection

Paper list and datasets for industrial image anomaly/defect detection (continuously updated).

1,730 (+5)

Zjh-819/LLMDataHub

A quick guide (especially) for trending instruction finetuning datasets

2,716 (+4)

mit

googlecreativelab/quickdraw-dataset

Documentation on how to access and use the Quick, Draw! Dataset.

6,212 (+4)

Last 3 days (relative gain)

RingBDStack/SocialED

A Python library for social event detection

146 (+11%)

bsd-2-clause

WayneMao/RoboMatrix

The Official Implementation of RoboMatrix

60 (+9%)

mit

nalinrajendran/synthetic-LLM-QA-dataset-generator

Create synthetic datasets for training and testing large language models (LLMs) in a question-answering (QA) context.

25 (+4%)

microsoft/monitors4codegen

228 (+3%)

mit

zamanzadeh/ts-anomaly-benchmark

Time-Series Anomaly Detection Comprehensive Benchmark

143 (+1%)

mit

ESA-PhiLab/Major-TOM

Expandable Datasets for Earth Observation

159 (+1%)

Sreyan88/GAMA

Code for the paper: GAMA: A Large Audio-Language Model with Advanced Audio Understanding and Complex Reasoning Abilities

92 (+1%)

apache-2.0

modelscope/data-juicer

Making data higher-quality, juicier, and more digestible for foundation models! 🍎 🍋 🌽 ➡️ ➡️🍸 🍹 🍷

3,164 (+1%)

apache-2.0

greydanus/mnist1d

A 1D analogue of the MNIST dataset for measuring spatial biases and answering Science of Deep Learning questions.

202 (+1%)

apache-2.0

graspnet/graspnetAPI

Toolbox for our GraspNet-1Billion dataset.

211 (+1.0%)

Vchitect/VBench

[CVPR2024 Highlight] VBench - We Evaluate Video Generation

652 (+0.8%)

apache-2.0

magpie-align/magpie

Official repository for "Alignment Data Synthesis from Scratch by Prompting Aligned LLMs with Nothing". Your efficient and high-quality synthetic data generation pipeline!

535 (+0.8%)

mit

Jakaria08/EESRGAN

Small-Object Detection in Remote Sensing (satellite) Images with End-to-End Edge-Enhanced GAN and Object Detector Network

292 (+0.7%)

gpl-3.0

DataDog/malicious-software-packages-dataset

An open-source dataset of malicious software packages found in the wild, 100% vetted by humans.

161 (+0.6%)

apache-2.0

shukkkur/VolleyVision

Applying Deep Learning Approaches to Volleyball Data

183 (+0.5%)

agpl-3.0

a-r-j/ProteinWorkshop

Benchmarking framework for protein representation learning. Includes a large number of pre-training and downstream task datasets, models and training/task utilities. (ICLR 2024)

210 (+0.5%)

mit

lzz19980125/awesome-time-series-segmentation-papers

This repository contains a reading list of papers on Time Series Segmentation. This repository is still being continuously improved.

442 (+0.5%)

gpl-3.0

facebookresearch/stopes

A library for preparing data for machine translation research (monolingual preprocessing, bitext mining, etc.) built by the FAIR NLLB team.

257 (+0.4%)

mit

M-3LAB/awesome-industrial-anomaly-detection

Paper list and datasets for industrial image anomaly/defect detection (continuously updated).

1,730 (+0.3%)

thuiar/MMSA

MMSA is a unified framework for Multimodal Sentiment Analysis.

715 (+0.3%)

mit

Last week (new repositories)

No newly created repositories trending in the last week

Last week (absolute gain)

public-apis/public-apis

A collective list of free APIs

320,395 (+579)

mit

HumanSignal/label-studio

Label Studio is a multi-type data labeling and annotation tool with standardized output format

19,867 (+99)

apache-2.0

lukas-blecher/LaTeX-OCR

pix2tex: Using a ViT to convert images of equations into LaTeX code.

13,116 (+79)

mit

modelscope/data-juicer

Making data higher-quality, juicier, and more digestible for foundation models! 🍎 🍋 🌽 ➡️ ➡️🍸 🍹 🍷

3,164 (+63)

apache-2.0

RingBDStack/SocialED

A Python library for social event detection

146 (+49)

bsd-2-clause

cvat-ai/cvat

Annotate better with CVAT, the industry-leading data engine for machine learning. Used and trusted by teams at any scale, for data of any scale.

12,844 (+38)

mit

satellite-image-deep-learning/techniques

Techniques for deep learning with satellite & aerial imagery

8,823 (+26)

apache-2.0

hyunwoongko/transformer

Transformer: PyTorch Implementation of "Attention Is All You Need"

3,175 (+24)

lonePatient/awesome-pretrained-chinese-nlp-models

Awesome Pretrained Chinese NLP Models: a collection of high-quality Chinese pretrained models, large models, multimodal models, and large language models

4,996 (+21)

mit

rom1504/img2dataset

Easily turn large sets of image URLs into an image dataset. Can download, resize, and package 100M URLs in 20 hours on one machine.

3,806 (+19)

mit

Zjh-819/LLMDataHub

A quick guide (especially) for trending instruction finetuning datasets

2,716 (+17)

mit

M-3LAB/awesome-industrial-anomaly-detection

Paper list and datasets for industrial image anomaly/defect detection (continuously updated).

1,730 (+16)

WayneMao/RoboMatrix

The Official Implementation of RoboMatrix

60 (+13)

mit

linhandev/dataset

An Index for Medical Imaging Datasets

2,773 (+13)

doccano/doccano

Open source annotation tool for machine learning practitioners.

9,654 (+13)

mit

microsoft/monitors4codegen

228 (+11)

mit

NirantK/awesome-project-ideas

Curated list of Machine Learning, NLP, Vision, Recommender Systems Project Ideas

7,964 (+11)

mit

brightmart/nlp_chinese_corpus

Large Scale Chinese Corpus for NLP

9,540 (+11)

mit

magpie-align/magpie

Official repository for "Alignment Data Synthesis from Scratch by Prompting Aligned LLMs with Nothing". Your efficient and high-quality synthetic data generation pipeline!

535 (+11)

mit

GanjinZero/awesome_Chinese_medical_NLP

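A compilation of public resources for Chinese medical NLP: terminologies, corpora, word embeddings, pretrained models, knowledge graphs, named entity recognition, QA, information extraction, models, papers, etc.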

2,223 (+10)

Last week (relative gain)

RingBDStack/SocialED

A Python library for social event detection

146 (+51%)

bsd-2-clause

WayneMao/RoboMatrix

The Official Implementation of RoboMatrix

60 (+28%)

mit

cruiseresearchgroup/DIEF_BTS

The Building TimeSeries (BTS) dataset covers three buildings over a three-year period, comprising more than ten thousand timeseries data points with hundreds of unique ontologies. Moreover, the meta...

27 (+23%)

mit

omegalabsinc/omegalabs-bittensor-subnet

The World's Largest Decentralized AGI Multimodal Dataset

39 (+15%)

mit

nalinrajendran/synthetic-LLM-QA-dataset-generator

Create synthetic datasets for training and testing large language models (LLMs) in a question-answering (QA) context.

25 (+14%)

zhenglinpan/AnitaDataset

A free, licensed, and industrial animation dataset

36 (+9%)

Factor-Robotics/Roller-Coaster-SLAM-Dataset

The world's first roller coaster SLAM dataset

107 (+6%)

cc0-1.0

zamanzadeh/ts-anomaly-benchmark

Time-Series Anomaly Detection Comprehensive Benchmark

143 (+6%)

mit

microsoft/monitors4codegen

228 (+5%)

mit

kaymal/acik-veri

Curated list of open data platforms of Türkiye

72 (+4%)

mit

FinnedAI/sportsbookreview-scraper

Sportsbookreview.com scraper + complete 10Y games+odds data for NFL, NBA, NHL, MLB for bettors and sports analysts

27 (+4%)

mit

jonathanwvd/awesome-industrial-datasets

A curated collection of public industrial datasets.

88 (+4%)

Sreyan88/GAMA

Code for the paper: GAMA: A Large Audio-Language Model with Advanced Audio Understanding and Complex Reasoning Abilities

92 (+3%)

apache-2.0

FerretAngel/labelImage

A web-based tool for online image data annotation

31 (+3%)

apache-2.0

sled-group/3D-GRAND

Official Implementation of 3D-GRAND: Towards Better Grounding and Less Hallucination for 3D-LLMs

36 (+3%)

rogerioxavier/X-Wines

A world wines dataset with user ratings for recommendation systems and general use.

36 (+3%)

cc0-1.0

mapooon/PetFace

[ECCV 2024 Oral] PetFace: A Large-Scale Dataset and Benchmark for Animal Identification https://arxiv.org/abs/2407.13555

36 (+3%)

agmmnn/turkish-nlp-resources

🔡 List of Tools, Libraries, Models, Datasets and other resources for Turkish NLP.

114 (+3%)

Reza-Zhu/SUES-200-Benchmark

SUES-200: A Multi-height Multi-scene Cross-view Image Benchmark Across Drone and Satellite

40 (+3%)

mit

CHAOZHAO-1/HUSTbearing-dataset

This repository releases a bearing failure dataset that can support intelligent fault diagnosis research (an open-source bearing dataset collected in the lab, covering both constant and time-varying rotational speeds)

43 (+2%)

Last month (new repositories)

RingBDStack/SocialED

A Python library for social event detection

146

bsd-2-clause

Last month (absolute gain)

public-apis/public-apis

A collective list of free APIs

320,395 (+2,449)

mit

HumanSignal/label-studio

Label Studio is a multi-type data labeling and annotation tool with standardized output format

19,867 (+462)

apache-2.0

lukas-blecher/LaTeX-OCR

pix2tex: Using a ViT to convert images of equations into LaTeX code.

13,116 (+369)

mit

cvat-ai/cvat

Annotate better with CVAT, the industry-leading data engine for machine learning. Used and trusted by teams at any scale, for data of any scale.

12,844 (+197)

mit

modelscope/data-juicer

Making data higher-quality, juicier, and more digestible for foundation models! 🍎 🍋 🌽 ➡️ ➡️🍸 🍹 🍷

3,164 (+193)

apache-2.0

RingBDStack/SocialED

A Python library for social event detection

146 (+145)

bsd-2-clause

hyunwoongko/transformer

Transformer: PyTorch Implementation of "Attention Is All You Need"

3,175 (+124)

satellite-image-deep-learning/techniques

Techniques for deep learning with satellite & aerial imagery

8,823 (+109)

apache-2.0

lonePatient/awesome-pretrained-chinese-nlp-models

Awesome Pretrained Chinese NLP Models: a collection of high-quality Chinese pretrained models, large models, multimodal models, and large language models

4,996 (+94)

mit

M-3LAB/awesome-industrial-anomaly-detection

Paper list and datasets for industrial image anomaly/defect detection (continuously updated).

1,730 (+87)

rom1504/img2dataset

Easily turn large sets of image URLs into an image dataset. Can download, resize, and package 100M URLs in 20 hours on one machine.

3,806 (+78)

mit

linhandev/dataset

An Index for Medical Imaging Datasets

2,773 (+77)

Zjh-819/LLMDataHub

A quick guide (especially) for trending instruction finetuning datasets

2,716 (+72)

mit

joke2k/faker

Faker is a Python package that generates fake data for you.

17,864 (+72)

mit

Vchitect/VBench

[CVPR2024 Highlight] VBench - We Evaluate Video Generation

652 (+69)

apache-2.0

ashvardanian/StringZilla

Up to 10x faster strings for C, C++, Python, Rust, and Swift, leveraging NEON, AVX2, AVX-512, and SWAR to accelerate search, sort, edit distances, alignment scores, etc 🦖

2,303 (+68)

apache-2.0

doccano/doccano

Open source annotation tool for machine learning practitioners.

9,654 (+64)

mit

Charmve/Surface-Defect-Detection

📈 Currently the largest collection of industrial defect detection datasets and papers. Continuously summarizes important open-source datasets and key papers in surface defect research.

3,255 (+59)

mit

WayneMao/RoboMatrix

The Official Implementation of RoboMatrix

60 (+58)

mit

zalandoresearch/fashion-mnist

An MNIST-like fashion product database. Benchmark 👇

12,029 (+52)

mit

Last month (relative gain)

cruiseresearchgroup/DIEF_BTS

27 (+108%)

mit

Factor-Robotics/Roller-Coaster-SLAM-Dataset

The world's first roller coaster SLAM dataset

107 (+60%)

cc0-1.0

HannahKirk/prism-alignment

The Prism Alignment Project

58 (+53%)

omegalabsinc/omegalabs-bittensor-subnet

The World's Largest Decentralized AGI Multimodal Dataset

39 (+30%)

mit

nalinrajendran/synthetic-LLM-QA-dataset-generator

Create synthetic datasets for training and testing large language models (LLMs) in a question-answering (QA) context.

25 (+25%)

FerretAngel/labelImage

A web-based tool for online image data annotation

31 (+24%)

apache-2.0

whq-xxh/SFADA-GTV-Seg

(TMI-2024) Source-Free Active Domain Adaptation (SFADA) for GTV Segmentation across Multiple Hospitals

30 (+20%)

mit

CHAOZHAO-1/HUSTbearing-dataset

This repository releases a bearing failure dataset that can support intelligent fault diagnosis research (an open-source bearing dataset collected in the lab, covering both constant and time-varying rotational speeds)

43 (+19%)

arbaev/russia-cities

Russia cities and regions: a lot of data in JSON format

25 (+19%)

chenjingong/DN-ReID

[CVPR 2024] Day-Night Cross-domain Vehicle Re-identification

27 (+17%)

Sunny5250/Awesome-Multi-Setting-UIAD

A taxonomy of industrial anomaly detection methods and datasets (updating).

72 (+16%)

sled-group/3D-GRAND

Official Implementation of 3D-GRAND: Towards Better Grounding and Less Hallucination for 3D-LLMs

36 (+16%)

CHAOZHAO-1/Awsome-Multi-modal-based-PHM

Awsome-Multi-modal-based PHM (multimodal fault diagnosis and prognosis)

38 (+15%)

MWiechmann/enron_spam_data

The Enron-Spam dataset preprocessed in a single, clean csv file.

41 (+14%)

gpl-3.0

thuiar/MIntRec2.0

MIntRec2.0 is the first large-scale dataset for multimodal intent recognition and out-of-scope detection in multi-party conversations (ICLR 2024)

33 (+14%)

Sreyan88/GAMA

Code for the paper: GAMA: A Large Audio-Language Model with Advanced Audio Understanding and Complex Reasoning Abilities

92 (+14%)

apache-2.0

HowieHwong/UniGen

UniGen: A Unified Framework for Dataset Generation via Large Language Model

34 (+13%)

zamanzadeh/ts-anomaly-benchmark

Time-Series Anomaly Detection Comprehensive Benchmark

143 (+13%)

mit

neoneye/arc-dataset-collection

Multiple datasets for ARC (Abstraction and Reasoning Corpus)

45 (+13%)

Vchitect/VBench

[CVPR2024 Highlight] VBench - We Evaluate Video Generation

652 (+12%)

apache-2.0

Last 12 months (new repositories)

google-deepmind/long-form-factuality

Benchmarking long-form factuality in large language models. Original code for our paper "Long-form factuality in large language models".

565

magpie-align/magpie

Official repository for "Alignment Data Synthesis from Scratch by Prompting Aligned LLMs with Nothing". Your efficient and high-quality synthetic data generation pipeline!

535

mit

HowieHwong/TrustLLM

[ICML 2024] TrustLLM: Trustworthiness in Large Language Models

493

mit

lzz19980125/awesome-time-series-segmentation-papers

This repository contains a reading list of papers on Time Series Segmentation. This repository is still being continuously improved.

442

gpl-3.0

GTSinger/GTSinger

Dataset and code of GTSinger (NeurIPS 2024 Spotlight): A Global Multi-Technique Singing Corpus with Realistic Music Scores for All Singing Tasks

246

TechForPalestine/palestine-datasets

The human toll of Israel's ongoing genocide in names & numbers. Use the data from our APIs to tell their story.

225

google/imageinwords

Data release for the ImageInWords (IIW) paper.

203

PKU-YuanGroup/ChronoMagic-Bench

[NeurIPS 2024 D&B Spotlight🔥] ChronoMagic-Bench: A Benchmark for Metamorphic Evaluation of Text-to-Time-lapse Video Generation

190

apache-2.0

zjunlp/IEPile

[ACL 2024] IEPile: A Large-Scale Information Extraction Corpus

177

ESA-PhiLab/Major-TOM

Expandable Datasets for Earth Observation

159

IAAR-Shanghai/xFinder

xFinder: Robust and Pinpoint Answer Extraction for Large Language Models

154

RingBDStack/SocialED

A Python library for social event detection

146

bsd-2-clause

HPMLL/BurstGPT

A ChatGPT (GPT-3.5) & GPT-4 Workload Trace to Optimize LLM Serving Systems

139

cc-by-4.0

amaralibey/OpenVPRLab

An Open-source Deep Learning Framework for Visual Place Recognition

117

mit

LLM-Red-Team/emo-visual-data

😜 A visual dataset of memes (sticker packs), annotated using the image-understanding capabilities of glm-4v and step-1v.

104

neoneye/ARC-Interactive-History-Dataset

History files recorded from human interaction while solving ARC tasks

mit

Sreyan88/GAMA

Code for the paper: GAMA: A Large Audio-Language Model with Advanced Audio Understanding and Complex Reasoning Abilities

apache-2.0

wildavatar/WildAvatar_Toolbox

[ArXiv 2024] WildAvatar: Web-scale In-the-wild Video Dataset for 3D Avatar Creation

tomandjerry136/macdata

MAC Address Database

CAS-SIAT-XinHai/CPsyCoun

[ACL 2024] CPsyCoun: A Report-based Multi-turn Dialogue Reconstruction and Evaluation Framework for Chinese Psychological Counseling

cc-by-4.0

Last 12 months (absolute gain)

public-apis/public-apis

A collective list of free APIs

320,395 (+48,289)

mit

HumanSignal/label-studio

Label Studio is a multi-type data labeling and annotation tool with standardized output format

19,867 (+4,778)

apache-2.0

lukas-blecher/LaTeX-OCR

pix2tex: Using a ViT to convert images of equations into LaTeX code.

13,116 (+4,126)

mit

modelscope/data-juicer

Making data higher-quality, juicier, and more digestible for foundation models! 🍎 🍋 🌽 ➡️ ➡️🍸 🍹 🍷

3,164 (+2,433)

apache-2.0

cvat-ai/cvat

Annotate better with CVAT, the industry-leading data engine for machine learning. Used and trusted by teams at any scale, for data of any scale.

12,844 (+2,225)

mit

satellite-image-deep-learning/techniques

Techniques for deep learning with satellite & aerial imagery

8,823 (+1,596)

apache-2.0

hyunwoongko/transformer

Transformer: PyTorch Implementation of "Attention Is All You Need"

3,175 (+1,558)

Zjh-819/LLMDataHub

A quick guide (especially) for trending instruction finetuning datasets

2,716 (+1,526)

mit

lonePatient/awesome-pretrained-chinese-nlp-models

Awesome Pretrained Chinese NLP Models: a collection of high-quality Chinese pretrained models, large models, multimodal models, and large language models

4,996 (+1,428)

mit

ashvardanian/StringZilla

Up to 10x faster strings for C, C++, Python, Rust, and Swift, leveraging NEON, AVX2, AVX-512, and SWAR to accelerate search, sort, edit distances, alignment scores, etc 🦖

2,303 (+1,396)

apache-2.0

joke2k/faker

Faker is a Python package that generates fake data for you.

17,864 (+1,209)

mit

M-3LAB/awesome-industrial-anomaly-detection

Paper list and datasets for industrial image anomaly/defect detection (continuously updated).

1,730 (+1,158)

doccano/doccano

Open source annotation tool for machine learning practitioners.

9,654 (+1,070)

mit

linhandev/dataset

An Index for Medical Imaging Datasets

2,773 (+935)

rom1504/img2dataset

Easily turn large sets of image URLs into an image dataset. Can download, resize, and package 100M URLs in 20 hours on one machine.

3,806 (+924)

mit

luban-agi/Awesome-Domain-LLM

Collects and organizes open-source models, datasets, and evaluation benchmarks for vertical domains.

2,307 (+817)

mit

Charmve/Surface-Defect-Detection

📈 Currently the largest collection of industrial defect detection datasets and papers. Continuously summarizes important open-source datasets and key papers in surface defect research.

3,255 (+792)

mit

zalandoresearch/fashion-mnist

An MNIST-like fashion product database. Benchmark 👇

12,029 (+758)

mit

NirantK/awesome-project-ideas

Curated list of Machine Learning, NLP, Vision, Recommender Systems Project Ideas

7,964 (+680)

mit

brightmart/nlp_chinese_corpus

Large Scale Chinese Corpus for NLP

9,540 (+665)

mit

Last 12 months (relative gain)

magpie-align/magpie

Official repository for "Alignment Data Synthesis from Scratch by Prompting Aligned LLMs with Nothing". Your efficient and high-quality synthetic data generation pipeline!

535 (+8,817%)

mit

GTSinger/GTSinger

Dataset and code of GTSinger (NeurIPS 2024 Spotlight): A Global Multi-Technique Singing Corpus with Realistic Music Scores for All Singing Tasks

246 (+3,414%)

longyuewangdcu/GuoFeng-Webnovel

Multilingual Corpus of Web Fiction

220 (+2,650%)

bytedance/Shot2Story

A new multi-shot video understanding benchmark Shot2Story with comprehensive video summaries and detailed shot-level captions.

104 (+2,500%)

HPMLL/BurstGPT

A ChatGPT (GPT-3.5) & GPT-4 Workload Trace to Optimize LLM Serving Systems

139 (+2,217%)

cc-by-4.0

MrGiovanni/SuPreM

[ICLR 2024 Oral] Supervised Pre-Trained 3D Models for Medical Image Analysis (9,262 CT volumes + 25 annotated classes)

269 (+1,821%)

CHAOZHAO-1/DG-PHM

A repository of papers, code, and datasets on domain generalization-based fault diagnosis and prognosis.

208 (+1,791%)

Vinyzu/chrome-fingerprints

A collection of 10,000 Windows Chrome fingerprints. Usable through an easy-to-use API and available as compressed (lzma) or full-size JSON (see Releases). It's just 1.4 MB in size in compressed f...

186 (+1,760%)

gpl-3.0