Search Results - RepositoryStats

InternVideo OpenGVLab

105

1.8k

apache-2.0

27

[ECCV2024] Video Foundation Models & Data for Multimodal Understanding

Created 2022-11-23

245 commits to main branch, last one 26 days ago

ClipBERT jayleicn

86

717

mit

8

[CVPR 2021 Best Student Paper Honorable Mention, Oral] Official PyTorch code for ClipBERT, an efficient framework for end-to-end learning on image-text and video-text tasks.

vqa pytorch cvpr2021 video-retrieval vision-and-language video-question-answering

Created 2021-02-10

14 commits to main branch, last one 2 years ago

MiniGPT4-video Vision-CAIR

67

603

bsd-3-clause

12

Official code for Goldfish model for long video understanding and MiniGPT4-video for short video understanding

video-retrieval video-understanding long-video-understanding video-question-answering

Created 2024-03-23

190 commits to main branch, last one 3 months ago

collaborative-experts albanie

54

339

apache-2.0

9

Video embeddings for retrieval with natural language queries

video-retrieval deep-neural-networks

Created 2019-07-17

90 commits to master branch, last one 2 years ago

Youku-mPLUG X-PLUG

11

294

apache-2.0

6

Youku-mPLUG: A 10 Million Large-scale Chinese Video-Language Pre-training Dataset and Benchmarks

mllm video youku chinese dataset benchmark multimodal video-retrieval multimodal-pretraining video-question-answering multimodal-large-language-models

Created 2023-06-06

18 commits to main branch, last one about a year ago

moment_detr jayleicn

49

292

mit

8

[NeurIPS 2021] Moment-DETR code and QVHighlights dataset

pytorch video-retrieval

Created 2021-07-20

9 commits to main branch, last one 2 years ago

QD-DETR wjun0830

16

224

other

3

Official PyTorch repository for "QD-DETR: Query-Dependent Video Representation for Moment Retrieval and Highlight Detection" (CVPR 2023 Paper)

multi-modal deep-learning computer-vision video-retrieval moment-retrieval video-summarization text-video-retrieval detection-transformer video-highlight-detection

Created 2023-01-30

31 commits to main branch, last one about a year ago

mPLUG-2 X-PLUG

19

223

apache-2.0

4

mPLUG-2: A Modularized Multi-modal Foundation Model Across Text, Image and Video (ICML 2023)

vqa mllm mplug video multimodal image-retrieval video-retrieval foundation-models multimodal-pretraining video-question-answering

Created 2023-05-22

4 commits to main branch, last one about a year ago

visil MKLab-ITI

39

215

apache-2.0

8

Authors' official PyTorch implementation of "ViSiL: Fine-grained Spatio-Temporal Video Similarity Learning" [ICCV 2019]

fivr ndvr video-search near-duplicates video-retrieval duplicate-videos video-similarity-search video-similarity-learning near-duplicate-video-retrieval

Created 2019-08-14

38 commits to master branch, last one about a year ago

TVRetrieval jayleicn

24

157

mit

7

[ECCV 2020] PyTorch code for XML on TVRetrieval dataset - TVR: A Large-Scale Dataset for Video-Subtitle Moment Retrieval

tvc tvr dataset pytorch video-retrieval

Created 2020-01-27

17 commits to master branch, last one 10 months ago

pytorch_violet tsujuifu

7

137

unknown

8

A PyTorch implementation of VIOLET

pytorch pre-training video-retrieval vision-and-language video-question-answering

Created 2021-11-24

47 commits to main branch, last one about a year ago

EMCL jpthu17

9

130

mit

2

[NeurIPS 2022 Spotlight] Expectation-Maximization Contrastive Learning for Compact Video-and-Language Representations

neurips video-retrieval video-captioning cross-modal-retrieval video-question-answering

Created 2022-09-23

33 commits to main branch, last one 11 months ago

DiffusionRet jpthu17

7

129

apache-2.0

2

[ICCV 2023] DiffusionRet: Generative Text-Video Retrieval with Diffusion Model

iccv2023 video-retrieval diffusion-models cross-modal-retrieval

Created 2023-03-16

20 commits to main branch, last one 11 months ago

ndvr-dml MKLab-ITI

18

119

apache-2.0

5

Authors' official TensorFlow implementation of "Near-Duplicate Video Retrieval with Deep Metric Learning" [ICCVW 2017]

dml ndvr video-retrieval deep-metric-learning near-duplicate-video-retrieval

Created 2018-09-13

29 commits to master branch, last one about a year ago

HBI jpthu17

5

112

apache-2.0

3

[CVPR 2023 Highlight & TPAMI] Video-Text as Game Players: Hierarchical Banzhaf Interaction for Cross-Modal Representation Learning

cvpr video-retrieval cross-modal-retrieval video-question-answering

Created 2023-02-28

35 commits to main branch, last one 2 months ago

HiREST j-min

9

100

mit

5

Hierarchical Video-Moment Retrieval and Step-Captioning (CVPR 2023)

hirest step-captioning video-retrieval moment-retrieval moment-segmentation vision-and-language

Created 2023-03-24

12 commits to main branch, last one 2 months ago

DRL foolwood

5

96

apache-2.0

3

[arXiv22] Disentangled Representation Learning for Text-Video Retrieval

clip transformer video-retrieval interaction-nets text-video-search-engine

Created 2022-04-07

4 commits to main branch, last one 2 years ago

distill-and-select mever-team

9

66

apache-2.0

9

Authors' official PyTorch implementation of "DnS: Distill-and-Select for Efficient and Accurate Video Indexing and Retrieval" [IJCV 2022]

fivr ndvr video-search video-retrieval duplicate-videos knowledge-distillation video-similarity-search video-similarity-learning near-duplicate-video-retrieval

Created 2021-06-24

42 commits to main branch, last one about a year ago

TransVCL transvcl

6

54

mit

3

TransVCL: Attention-enhanced Video Copy Localization Network with Flexible Supervision [AAAI2023 Oral]

video-retrieval temporal-alignment video-copy-detection

Created 2022-08-08

7 commits to main branch, last one 2 years ago

DiCoSA jpthu17

2

51

apache-2.0

2

[IJCAI 2023] Text-Video Retrieval with Disentangled Conceptualization and Set-to-Set Alignment

ijcai video-retrieval cross-modal-retrieval

Created 2023-04-29

15 commits to main branch, last one 11 months ago

PKOL zchoi

0

46

mit

1

[TIP 2022] Official code for the paper “Video Question Answering with Prior Knowledge and Object-sensitive Learning”

pytorch video-retrieval vision-language pytorch-implementation video-question-answering

Created 2022-01-24

43 commits to main branch, last one about a year ago

COSA TXH-mercury

3

43

mit

2

[ICLR2024] Codes and Models for COSA: Concatenated Sample Pretrained Vision-Language Foundation Model

video-qa video-retrieval video-captioning video-language-pretrainng vision-language-pretraining

Created 2023-05-24

9 commits to master branch, last one 3 months ago

s2vs gkordo

2

41

mit

1

Authors' official PyTorch implementation of "Self-Supervised Video Similarity Learning" [CVPRW 2023]

fivr ndvr video-search video-detection video-retrieval duplicate-videos self-supervision video-similarity video-similarity-search self-supervised-learning video-similarity-learning

Created 2023-04-05

21 commits to main branch, last one about a year ago

pytorch_empirical-mvm tsujuifu

2

40

unknown

2

A PyTorch implementation of EmpiricalMVM

pytorch cvpr2023 pre-training video-retrieval video-captioning vision-and-language video-question-answering

Created 2023-03-09

9 commits to main branch, last one about a year ago

MMC-PCFG Sy-Zhang

4

40

mit

1

Video-aided Unsupervised Grammar Induction, NAACL'21 [best long paper]

video-retrieval grammar-induction

Created 2021-04-09

18 commits to master branch, last one 2 years ago

ViCC martinetoering

8

36

other

2

[WACV'22] Code repository for the paper "Self-supervised Video Representation Learning with Cross-Stream Prototypical Contrasting", https://arxiv.org/abs/2106.10137.

video-retrieval video-recognition action-recognition contrastive-learning unsupervised-learning self-supervised-learning

Created 2021-06-16

17 commits to main branch, last one 2 years ago

awesome-video-text-datasets willyfh

3

36

mit

2

A curated list of video-text datasets in a variety of languages. These datasets can be used for video captioning (video description) or video retrieval.

dataset video-text video-to-text video-language video-retrieval vision-language video-captioning video-description

Created 2023-01-03

26 commits to main branch, last one about a year ago

MELTR mlvlab

7

33

mit

7

MELTR: Meta Loss Transformer for Learning to Fine-tune Video Foundation Models (CVPR 2023)

cvpr2023 multi-modal meta-learning video-retrieval video-captioning video-question-answering

Created 2023-03-23

9 commits to master branch, last one 11 months ago

TextVR callsys

0

25

mit

2

[PR 2024] A large Cross-Modal Video Retrieval Dataset with Reading Comprehension

text video-retrieval

Created 2023-05-03

38 commits to main branch, last one about a year ago