3 results found
[CVPR2021] SUTD-TrafficQA: A Question Answering Benchmark and an Efficient Network for Video Reasoning over Traffic Events
Created: 2021-03-27
39 commits to master branch, last one 5 months ago
[EMNLP 2023] TESTA: Temporal-Spatial Token Aggregation for Long-form Video-Language Understanding
Created: 2023-10-29
9 commits to main branch, last one about a year ago
[ICLR2024] Codes and Models for COSA: Concatenated Sample Pretrained Vision-Language Foundation Model
Created: 2023-05-24
9 commits to master branch, last one about a month ago