4 results found

63
964
apache-2.0
14
A general representation model across vision, audio, and language modalities. Paper: ONE-PEACE: Exploring One General Representation Model Toward Unlimited Modalities
Created 2023-05-18
136 commits to main branch, last one about a month ago
16
238
mit
18
Code and Model for VAST: A Vision-Audio-Subtitle-Text Omni-Modality Foundation Model and Dataset
Created 2023-05-29
13 commits to master branch, last one 7 months ago
7
121
unknown
6
Audio Large Language Models
Created 2024-06-15
51 commits to main branch, last one 13 days ago
7
77
apache-2.0
7
Code for the paper: GAMA: A Large Audio-Language Model with Advanced Audio Understanding and Complex Reasoning Abilities
Created 2024-06-15
24 commits to main branch, last one 3 months ago