3 results found
[CVPR2021] SUTD-TrafficQA: A Question Answering Benchmark and an Efficient Network for Video Reasoning over Traffic Events
Created 2021-03-27
39 commits to master branch, last one 2 months ago
[EMNLP 2023] TESTA: Temporal-Spatial Token Aggregation for Long-form Video-Language Understanding
Created 2023-10-29
9 commits to main branch, last one 10 months ago
Codes and Models for COSA: Concatenated Sample Pretrained Vision-Language Foundation Model
Created 2023-05-24
8 commits to master branch, last one about a year ago