8 results found

194
2.8k
apache-2.0
30
LLaMA-Omni is a low-latency and high-quality end-to-end speech interaction model built upon Llama-3.1-8B-Instruct, aiming to achieve speech capabilities at the GPT-4o level.
Created 2024-09-10
13 commits to main branch, last one 3 months ago
[ICLR 2025] SOTA discrete acoustic codec models with 40/75 tokens per second for audio language modeling
Created 2024-08-29
32 commits to main branch, last one 8 days ago
16
271
unknown
8
A Survey of Spoken Dialogue Models (60 pages)
Created 2024-11-11
32 commits to main branch, last one 3 months ago
11
177
unknown
7
AAAI 2025: Codec Does Matter: Exploring the Semantic Shortcoming of Codec for Audio Language Model
Created 2024-05-26
31 commits to main branch, last one 2 months ago
SlamKit is an open-source toolkit for efficient training of SpeechLMs. It was used for "Slamming: Training a Speech Language Model on One GPU in a Day".
Created 2025-02-18
15 commits to main branch, last one 5 days ago
Ultra-low-bitrate Speech Codec for Speech Language Modeling Applications
Created 2024-09-02
10 commits to master branch, last one 2 months ago
4
69
other
5
Code and model for the ICASSP 2025 paper "Developing Instruction-Following Speech Language Model Without Speech Instruction-Tuning Data"
Created 2024-09-13
11 commits to main branch, last one about a month ago
0
44
unknown
1
The official code for the SALMon benchmark (ICASSP 2025 - Oral)
Created 2024-04-29
29 commits to main branch, last one a day ago