4 results found
[NIPS2023] Code and Model for VAST: A Vision-Audio-Subtitle-Text Omni-Modality Foundation Model and Dataset
Created
2023-05-29
13 commits to master branch, last one 11 months ago
We introduce temporal working memory (TWM), which aims to enhance the temporal modeling capabilities of multimodal foundation models (MFMs). This plug-and-play module can be easily integrated into exi...
Created
2025-01-23
27 commits to main branch, last one 29 days ago
MADELEINE: multi-stain slide representation learning (ECCV'24)
Created
2024-07-16
45 commits to main branch, last one 5 days ago
Official implementation for "MJ-Bench: Is Your Multimodal Reward Model Really a Good Judge for Text-to-Image Generation?"
Created
2024-06-11
29 commits to main branch, last one 3 months ago