Search Results - RepositoryStats

196

2.9k

apache-2.0

30

LLaMA-Omni is a low-latency and high-quality end-to-end speech interaction model built upon Llama-3.1-8B-Instruct, aiming to achieve speech capabilities at the GPT-4o level.

speech-to-text speech-to-speech speech-interaction large-language-models speech-language-model multimodal-large-language-models

Created 2024-09-10

13 commits to main branch, last one 4 months ago

Applio IAHispano

355

2.2k

mit

32

A simple, high-quality voice conversion tool focused on ease of use and performance.

ai vc rvc tts vits voice applio speech pytorch voice-clone voice-cloning text-to-speech speech-to-speech voice-conversion

Created 2023-08-07

3,473 commits to main branch, last one 12 hours ago

CleanS2S opendilab

35

369

apache-2.0

6

High-quality and streaming Speech-to-Speech interactive agent in a single file. A streaming, full-duplex voice interaction prototype agent implemented in just one file!

ai gpt-4o python streaming machine-learning speech-synthesis speech-to-speech speech-recognition

Created 2024-09-26

49 commits to main branch, last one 13 days ago

Freeze-Omni VITA-MLLM

19

289

other

10

✨✨Freeze-Omni: A Smart and Low Latency Speech-to-speech Dialogue Model with Frozen LLM

speech speech-synthesis speech-to-speech speech-recognition large-language-models multimodal-large-language-models

Created 2024-11-04

21 commits to main branch, last one 2 months ago

real-time-voice-translator SamirPaulb

62

248

gpl-2.0

4

A desktop application that uses AI to translate voice between languages in real time, while preserving the speaker's tone and emotion.

Created 2023-08-16

36 commits to main branch, last one about a year ago

MooER MooreThreads

15

197

other

11

MooER: Moore-threads Open Omni model for speech-to-speech intERaction. MooER-omni includes a series of end-to-end speech interaction models along with training and inference code, covering but not lim...

gpt-4o chatgpt speech-to-text speech-to-speech speech-interaction speech-recognition speech-translation large-language-models

Created 2024-08-12

54 commits to master branch, last one 2 months ago

weebo amanvirparhar

33

184

mit

2

A real-time speech-to-speech chatbot powered by Whisper Small, Llama 3.2, and Kokoro-82M.

llama kokoro whisper speech-to-speech

Created 2025-01-16

9 commits to main branch, last one about a month ago

awesome-speech-translation dqqcasia

1

177

unknown

13

This repository has no description...

speech speech-synthesis speech-to-speech text-translation speech-processing speech-recognition speech-translation machine-translation speech-to-subtitles disfluency-detection punctuation-restoration simultaneous-translation cascaded-speech-translation multimodal-machine-learning natural-language-processing multimodal-machine-translation non-autoregressive-translation

Created 2019-09-18

155 commits to master branch, last one 3 years ago

samantha-os1 jesuscopado

75

135

mit

8

Samantha OS1 is a conversational AI assistant powered by the Realtime API from OpenAI

agent openai ai-agent realtime-api speech-to-speech

Created 2024-10-17

13 commits to main branch, last one 2 months ago

On-Device-Speech-to-Speech-Conversational-AI asiff00

7

73

mit

7

This is an on-CPU real-time conversational system for two-way speech communication with AI models, utilizing a continuous streaming architecture for fluid conversations with immediate responses and na...

asr tts vad ollama kokoro-tts voice-assistant audio-processing speech-to-speech conversational-ai

Created 2025-01-02

79 commits to main branch, last one 29 days ago

UltraEval-Audio OpenBMB

1

65

apache-2.0

7

An easy-to-use, fast, and easily integrable tool for evaluating audio LLMs

evaluation speech-to-text speech-to-speech speech-recognition

Created 2024-11-11

78 commits to main branch, last one 2 hours ago

svelte-openai-realtime-api flo-bit

9

61

mit

2

Svelte component for using the OpenAI Realtime API

openai svelte sveltekit realtime-api speech-to-speech

Created 2024-10-04

18 commits to main branch, last one 2 months ago

DASpeech ictnlp

5

61

unknown

4

Code for NeurIPS 2023 paper "DASpeech: Directed Acyclic Transformer for Fast and High-quality Speech-to-Speech Translation".

speech-to-speech speech-translation machine-translation speech-to-speech-translation

Created 2023-10-07

22 commits to main branch, last one 7 months ago

rtvc hparcells

6

53

gpl-3.0

3

💬 "Realtime" voice transcription and cloning using ElevenLabs's API.

ai api web website elevenlabs interactive voicecloning transcription voice-cloning voice-synthesis speech-to-speech

Created 2023-02-22

17 commits to master branch, last one 2 years ago

speech-to-speech liamdugan

6

30

unknown

2

Code for the INTERSPEECH 2023 paper "Learning When to Speak: Latency and Quality Trade-offs for Simultaneous Speech-to-Speech Translation with Offline Models"

speech speech-to-speech speech-processing speech-translation simultaneous-translation

Created 2023-01-31

54 commits to main branch, last one 2 months ago

Echo-XI lugia19

2

28

unknown

3

Speech-to-text-to-speech using ElevenLabs

tts voice python speech elevenlabs speech-to-text speech-synthesis speech-to-speech speech-recognition

Created 2023-02-05

110 commits to master branch, last one about a year ago