2 results found

257
2.8k
bsd-3-clause
32
[EMNLP 2023 Demo] Video-LLaMA: An Instruction-tuned Audio-Visual Language Model for Video Understanding
Created 2023-05-06
145 commits to main branch, last one 5 months ago
3
71
apache-2.0
4
[NeurIPS 2022 Spotlight] RLIP: Relational Language-Image Pre-training and a series of other methods to solve HOI detection and Scene Graph Generation.
Created 2022-08-31
98 commits to main branch, last one 5 months ago