Search Results - RepositoryStats

55

931

mit

21

SOTA discrete acoustic codec models with 40 tokens per second for audio language modeling

dac codec gpt4o encodec acoustic semantic soundstream text-to-speech audio-representation speech-language-model speech-representation music-representation-learning

Created 2024-08-29

30 commits to main branch, last one 14 hours ago

31

378

mit

16

FunCodec is a research-oriented toolkit for audio quantization and downstream applications, such as text-to-speech synthesis and music generation.

tts codec encodec voicecloning speech-to-text audio-generation speech-synthesis audio-quantization

Created 2023-10-07

71 commits to master branch, last one 12 months ago

13

237

unknown

7

A Survey of Spoken Dialogue Models (60 pages)

moshi duplex gpt-4o speech encodec mini-omni streaming llama-omni intreaction wavtokenizer modal-alignment speech-language-model speech-representation spoken-dialogue-models

Created 2024-11-11

32 commits to main branch, last one about a month ago