6 results found
LLaMA-Omni is a low-latency and high-quality end-to-end speech interaction model built upon Llama-3.1-8B-Instruct, aiming to achieve speech capabilities at the GPT-4o level.
Created: 2024-09-10
13 commits to main branch, last one 9 days ago
SOTA discrete acoustic codec models with 40 tokens per second for audio language modeling
Created: 2024-08-29
23 commits to main branch, last one 19 hours ago
Codec Does Matter: Exploring the Semantic Shortcoming of Codec for Audio Language Model
Created: 2024-05-26
28 commits to main branch, last one about a month ago
A Survey of Spoken Dialogue Models (60 pages)
Created: 2024-11-11
21 commits to main branch, last one 19 hours ago
Ultra-low-bitrate Speech Codec for Speech Language Modeling Applications
Created: 2024-09-02
8 commits to master branch, last one 2 months ago
The official code for the SALMon🍣 benchmark
Created: 2024-04-29
17 commits to main branch, last one 2 months ago