Trending repositories for topic video-classification
[CVPR 2024 Oral] InternVL Family: A Pioneering Open-Source Alternative to GPT-4V. A commercially usable open-source multimodal chat model that approaches GPT-4V performance.
OpenMMLab's Next Generation Video Understanding Toolbox and Benchmark
Tutorial for video classification/action recognition using 3D CNN and CNN+RNN on UCF101
Video classification on UCF101 using CNN and RNN, built on the PyTorch framework.
Papers, code and datasets about deep learning and multi-modal learning for video analysis
Video classification tools using 3D ResNet
SoccerAct10 is a dataset containing 10 different soccer actions. It was built from YouTube videos.
Easiest way of fine-tuning HuggingFace video classification models
Simplest and fastest image and text annotation tool.
Deepfakes Video classification via CNN, LSTM, C3D and triplets [IWBF'20]
An unofficial implementation of TubeViT in "Rethinking Video ViTs: Sparse Video Tubes for Joint Image and Video Learning"
CricShot10 is a video action recognition dataset consisting of 10 cricket batting shots. This dataset was developed using the videos from YouTube.
Implementation of TimeSformer from Facebook AI, a pure attention-based solution for video classification
[ICLR 2022] TAda! Temporally-Adaptive Convolutions for Video Understanding. This codebase provides solutions for video classification, video representation learning and temporal detection.
[CVPR 2021] TDN: Temporal Difference Networks for Efficient Action Recognition
[Neurocomputing 2019] Fast and Robust Dynamic Hand Gesture Recognition via Key Frames Extraction and Feature Fusion
Implementation of STAM (Space Time Attention Model), a pure and simple attention model that reaches SOTA for video classification
A notebook that walks through the steps to reproduce the results of the publication "Is Space-Time Attention All You Need for Video Understanding?"
3D ResNet Video Classification accelerated by TensorRT