82 results found
- Filter by Primary Language:
- Python (70)
- Jupyter Notebook (4)
- C++ (1)
- TypeScript (1)
OpenMMLab's Next Generation Video Understanding Toolbox and Benchmark
Created
2020-07-11
2,061 commits to main branch, last one about a year ago
A curated list of action recognition and related area resources
Created
2016-09-22
296 commits to master branch, last one about a year ago
[CVPR2024 Highlight][VideoChatGPT] ChatGPT with video understanding! And many more supported LMs such as miniGPT4, StableLM, and MOSS.
Created
2023-04-19
207 commits to main branch, last one about a month ago
[ICCV 2019] TSM: Temporal Shift Module for Efficient Video Understanding
Created
2019-03-27
48 commits to master branch, last one 8 months ago
An open-source toolbox for action understanding based on PyTorch
Created
2019-06-13
179 commits to master branch, last one 4 years ago
[ECCV2024] Video Foundation Models & Data for Multimodal Understanding
Topics: benchmark, multimodal, video-clip, video-data, video-dataset, self-supervised, video-retrieval, foundation-models, action-recognition, instruction-tuning, masked-autoencoder, vision-transformer, video-understanding, zero-shot-retrieval, contrastive-learning, open-set-recognition, video-question-answering, zero-shot-classification, temporal-action-localization, spatio-temporal-action-localization
Created
2022-11-23
245 commits to main branch, last one 11 days ago
Awesome video understanding toolkits based on PaddlePaddle. It supports video data annotation tools, lightweight RGB and skeleton based action recognition model, practical applications for video taggi...
Created
2020-11-12
2,918 commits to develop branch, last one 27 days ago
Code & Models for Temporal Segment Networks (TSN) in ECCV 2016
Created
2016-07-14
129 commits to master branch, last one 5 years ago
[NeurIPS 2022 Spotlight] VideoMAE: Masked Autoencoders are Data-Efficient Learners for Self-Supervised Video Pre-Training
Created
2022-03-23
64 commits to main branch, last one about a year ago
Temporal Segment Networks (TSN) in PyTorch
Created
2017-08-10
31 commits to master branch, last one 5 years ago
awesome grounding: A curated list of research papers in visual grounding
Created
2018-09-03
97 commits to master branch, last one about a year ago
[CVPR 2024 Highlight🔥] Chat-UniVi: Unified Visual Representation Empowers Large Language Models with Image and Video Understanding
Created
2023-11-13
83 commits to main branch, last one 4 months ago
Official code for Goldfish model for long video understanding and MiniGPT4-video for short video understanding
Created
2024-03-23
190 commits to main branch, last one 3 months ago
[CVPR 2023] VideoMAE V2: Scaling Video Masked Autoencoders with Dual Masking
Created
2023-04-04
21 commits to master branch, last one 5 months ago
[ICCV 2023] MeViS: A Large-scale Benchmark for Video Segmentation with Motion Expressions
Created
2023-08-01
11 commits to main branch, last one about a year ago
[CVPR 2021] TDN: Temporal Difference Networks for Efficient Action Recognition
Created
2020-12-17
27 commits to main branch, last one 2 years ago
Source code for "Taming Visually Guided Sound Generation" (Oral at the BMVC 2021)
Created
2021-10-17
24 commits to main branch, last one 8 months ago
Tools for movie and video research
Created
2019-06-05
91 commits to master branch, last one 2 years ago
[CVPRW'24] SoccerNet Game State Reconstruction: End-to-End Athlete Tracking and Identification on a Minimap (CVPR24 - CVSports workshop)
Created
2024-02-05
263 commits to main branch, last one 29 days ago
(2024CVPR) MA-LMM: Memory-Augmented Large Multimodal Model for Long-Term Video Understanding
Created
2024-03-26
18 commits to main branch, last one 7 months ago
deep learning sex position classifier
Created
2022-02-05
113 commits to master branch, last one about a year ago
Dataset, code and model for the CVPR'20 paper "The Garden of Forking Paths: Towards Multi-Future Trajectory Prediction". And for the ECCV'20 SimAug paper.
Created
2019-12-22
87 commits to master branch, last one 2 years ago
Temporally Efficient Vision Transformer for Video Instance Segmentation, CVPR 2022, Oral
Created
2022-03-21
19 commits to main branch, last one 2 years ago
【CVPR'2023 Highlight & TPAMI】Cap4Video: What Can Auxiliary Captions Do for Text-Video Retrieval?
Created
2023-01-07
32 commits to main branch, last one 3 months ago
OpenTAD is an open-source temporal action detection (TAD) toolbox based on PyTorch.
Created
2024-03-28
60 commits to main branch, last one 10 days ago
[ICLR 2022] TAda! Temporally-Adaptive Convolutions for Video Understanding. This codebase provides solutions for video classification, video representation learning and temporal detection.
Created
2021-06-23
27 commits to main branch, last one about a year ago
A Cross-platform AI chat application built with React Native and powered by Amazon Bedrock
Created
2024-11-06
18 commits to main branch, last one 9 days ago
【AAAI'2023 & IJCV】Transferring Vision-Language Models for Visual Recognition: A Classifier Perspective
Created
2022-06-15
38 commits to main branch, last one 9 months ago
[NeurIPS 2023 D&B] VidChapters-7M: Video Chapters at Scale
Created
2023-06-06
59 commits to main branch, last one about a year ago
🔥 🔥 🔥 [NeurIPS 2024] Hawk: Learning to Understand Open-World Video Anomalies
Created
2024-05-23
18 commits to main branch, last one 13 days ago