4 results found
[ECCV2024] Video Foundation Models & Data for Multimodal Understanding
benchmark, multimodal, video-clip, video-data, video-dataset, self-supervised, video-retrieval, foundation-models, action-recognition, instruction-tuning, masked-autoencoder, vision-transformer, video-understanding, zero-shot-retrieval, contrastive-learning, open-set-recognition, video-question-answering, zero-shot-classification, temporal-action-localization, spatio-temporal-action-localization
Created 2022-11-23
237 commits to main branch, last one 7 days ago
Spatio-Temporal Action Localization System
Created 2020-03-17
11 commits to master branch, last one 2 years ago
[CVPR 2021] Actor-Context-Actor Relation Network for Spatio-temporal Action Localization
Created 2020-06-12
35 commits to master branch, last one 3 years ago
You Only Watch One Frame for Online Spatio-Temporal Action Detection
Created 2022-04-25
728 commits to main branch, last one about a year ago