Search Results - RepositoryStats

195

2.9k

apache-2.0

32

LLaMA-Omni is a low-latency and high-quality end-to-end speech interaction model built upon Llama-3.1-8B-Instruct, aiming to achieve speech capabilities at the GPT-4o level.

speech-to-text speech-to-speech speech-interaction large-language-models speech-language-model multimodal-large-language-models

Created 2024-09-10

13 commits to main branch, last one 5 months ago

WavTokenizer jishengpeng

87

1.1k

mit

25

[ICLR 2025] SOTA discrete acoustic codec models with 40/75 tokens per second for audio language modeling

dac codec gpt4o encodec acoustic semantic soundstream text-to-speech audio-representation speech-language-model speech-representation music-representation-learning

Created 2024-08-29

32 commits to main branch, last one about a month ago

WavChat jishengpeng

16

288

unknown

8

A Survey of Spoken Dialogue Models (60 pages)

moshi duplex gpt-4o speech encodec mini-omni streaming llama-omni intreaction wavtokenizer modal-alignment speech-language-model speech-representation spoken-dialogue-models

Created 2024-11-11

32 commits to main branch, last one 4 months ago

slamkit slp-rl

8

203

mit

8

SlamKit is an open-source toolkit for efficient training of SpeechLMs. It was used for "Slamming: Training a Speech Language Model on One GPU in a Day".

transformers language-model efficient-training speech-language-model

Created 2025-02-18

46 commits to main branch, last one 13 days ago

xcodec zhenye234

12

198

unknown

7

AAAI 2025: Codec Does Matter: Exploring the Semantic Shortcoming of Codec for Audio Language Model

gpt audio codec music sound speech vall-e semantic tokenizer audio-codec text-to-music text-to-sound language-model text-to-speech speech-language-model self-supervised-learning

Created 2024-05-26

33 commits to main branch, last one 19 days ago

SoCodec hhguo

5

80

mit

8

Ultra-low-bitrate Speech Codec for Speech Language Modeling Applications

tts audio speech speech-codec speech-language-model

Created 2024-09-02

10 commits to master branch, last one 3 months ago

DeSTA2 kehanlu

5

79

other

5

Code and model for the ICASSP 2025 paper "Developing Instruction-Following Speech Language Model Without Speech Instruction-Tuning Data"

speech-processing large-language-models speech-language-model

Created 2024-09-13

11 commits to main branch, last one 3 months ago

salmon slp-rl

0

45

unknown

1

The official code for the SALMon🍣 benchmark (ICASSP 2025 - Oral)

acoustic-model audio-processing speech-language-model

Created 2024-04-29

31 commits to main branch, last one 6 days ago