4 results found

70
1.0k
apache-2.0
14
A general representation model across vision, audio, language modalities. Paper: ONE-PEACE: Exploring One General Representation Model Toward Unlimited Modalities
Created 2023-05-18
136 commits to main branch, last one 5 months ago
Audio Large Language Models
Created 2024-06-15
98 commits to main branch, last one 15 days ago
17
272
mit
16
[NIPS2023] Code and Model for VAST: A Vision-Audio-Subtitle-Text Omni-Modality Foundation Model and Dataset
Created 2023-05-29
13 commits to master branch, last one about a year ago
11
118
apache-2.0
9
Code for the paper: GAMA: A Large Audio-Language Model with Advanced Audio Understanding and Complex Reasoning Abilities
Created 2024-06-15
28 commits to main branch, last one 3 months ago