73 results found
OpenMMLab's Next Generation Video Understanding Toolbox and Benchmark
Created
2020-07-11
2,061 commits to main branch, last one about a year ago
A curated list of action recognition and related area resources
Created
2016-09-22
296 commits to master branch, last one about a year ago
A curated list of recent diffusion models for video generation, editing, restoration, understanding, etc.
Created
2023-04-17
357 commits to main branch, last one 3 days ago
[CVPR2024 Highlight][VideoChatGPT] ChatGPT with video understanding! And many more supported LMs such as miniGPT4, StableLM, and MOSS.
Created
2023-04-19
204 commits to main branch, last one 2 months ago
[ICCV 2019] TSM: Temporal Shift Module for Efficient Video Understanding
Created
2019-03-27
48 commits to master branch, last one 4 months ago
An open-source toolbox for action understanding based on PyTorch
Created
2019-06-13
179 commits to master branch, last one 4 years ago
Code & Models for Temporal Segment Networks (TSN) in ECCV 2016
Created
2016-07-14
129 commits to master branch, last one 5 years ago
Awesome video understanding toolkits based on PaddlePaddle. It supports video data annotation tools, lightweight RGB and skeleton based action recognition model, practical applications for video taggi...
Created
2020-11-12
2,906 commits to develop branch, last one a day ago
[ECCV2024] Video Foundation Models & Data for Multimodal Understanding
Topics: benchmark, multimodal, video-clip, video-data, video-dataset, self-supervised, video-retrieval, foundation-models, action-recognition, instruction-tuning, masked-autoencoder, vision-transformer, video-understanding, zero-shot-retrieval, contrastive-learning, open-set-recognition, video-question-answering, zero-shot-classification, temporal-action-localization, spatio-temporal-action-localization
Created
2022-11-23
213 commits to main branch, last one 4 days ago
[NeurIPS 2022 Spotlight] VideoMAE: Masked Autoencoders are Data-Efficient Learners for Self-Supervised Video Pre-Training
Created
2022-03-23
64 commits to main branch, last one about a year ago
Temporal Segment Networks (TSN) in PyTorch
Created
2017-08-10
31 commits to master branch, last one 5 years ago
awesome grounding: A curated list of research papers in visual grounding
Created
2018-09-03
97 commits to master branch, last one about a year ago
[CVPR 2024 Highlight🔥] Chat-UniVi: Unified Visual Representation Empowers Large Language Models with Image and Video Understanding
Created
2023-11-13
83 commits to main branch, last one about a month ago
Official code for Goldfish model for long video understanding and MiniGPT4-video for short video understanding
Created
2024-03-23
189 commits to main branch, last one 3 months ago
[CVPR 2023] VideoMAE V2: Scaling Video Masked Autoencoders with Dual Masking
Created
2023-04-04
21 commits to master branch, last one about a month ago
[ICCV 2023] MeViS: A Large-scale Benchmark for Video Segmentation with Motion Expressions
Created
2023-08-01
11 commits to main branch, last one about a year ago
[CVPR 2021] TDN: Temporal Difference Networks for Efficient Action Recognition
Created
2020-12-17
27 commits to main branch, last one 2 years ago
Source code for "Taming Visually Guided Sound Generation" (Oral at BMVC 2021)
Created
2021-10-17
24 commits to main branch, last one 4 months ago
Tools for movie and video research
Created
2019-06-05
91 commits to master branch, last one 2 years ago
Dataset, code and model for the CVPR'20 paper "The Garden of Forking Paths: Towards Multi-Future Trajectory Prediction" and for the ECCV'20 SimAug paper.
Created
2019-12-22
87 commits to master branch, last one about a year ago
(2024CVPR) MA-LMM: Memory-Augmented Large Multimodal Model for Long-Term Video Understanding
Created
2024-03-26
18 commits to main branch, last one 4 months ago
【CVPR'2023 Highlight & TPAMI】Cap4Video: What Can Auxiliary Captions Do for Text-Video Retrieval?
Created
2023-01-07
31 commits to main branch, last one 2 months ago
[CVPRW'24] SoccerNet Game State Reconstruction: End-to-End Athlete Tracking and Identification on a Minimap (CVPR24 - CVSports workshop)
Created
2024-02-05
235 commits to main branch, last one 12 days ago
Temporally Efficient Vision Transformer for Video Instance Segmentation, CVPR 2022, Oral
Created
2022-03-21
19 commits to main branch, last one about a year ago
deep learning sex position classifier
Created
2022-02-05
113 commits to master branch, last one about a year ago
[ICLR 2022] TAda! Temporally-Adaptive Convolutions for Video Understanding. This codebase provides solutions for video classification, video representation learning and temporal detection.
Created
2021-06-23
27 commits to main branch, last one about a year ago
【AAAI'2023 & IJCV】Transferring Vision-Language Models for Visual Recognition: A Classifier Perspective
Created
2022-06-15
38 commits to main branch, last one 5 months ago
OpenTAD is an open-source temporal action detection (TAD) toolbox based on PyTorch.
Created
2024-03-28
54 commits to main branch, last one about a month ago
[NeurIPS 2023 D&B] VidChapters-7M: Video Chapters at Scale
Created
2023-06-06
59 commits to main branch, last one about a year ago
[CVPR 2022 Oral] TubeDETR: Spatio-Temporal Video Grounding with Transformers
Created
2022-03-19
18 commits to main branch, last one about a year ago