Search Results - RepositoryStats

2 results found

266

2.9k

bsd-3-clause

[EMNLP 2023 Demo] Video-LLaMA: An Instruction-tuned Audio-Visual Language Model for Video Understanding

blip2 llama minigpt4 multi-modal-chatgpt large-language-models cross-modal-pretraining video-language-pretraining vision-language-pretraining

Created 2023-05-06

145 commits to main branch, last one 8 months ago

RLIP JacobYuan7

apache-2.0

[NeurIPS 2022 Spotlight] RLIP: Relational Language-Image Pre-training and a series of other methods to solve HOI detection and Scene Graph Generation.

relation hoi-detection detection-model relation-detection cross-modal-pretraining

Created 2022-08-31

98 commits to main branch, last one 8 months ago