Trending repositories for topic text-to-speech

Last 3 days (new repositories)

no newly created repositories trending in the last 3 days

Last 3 days (absolute gain)

RVC-Boss/GPT-SoVITS

1 min voice data can also be used to train a good TTS model! (few shot voice cloning)

38,127 (+390)

mit

FunAudioLLM/CosyVoice

Multi-lingual large voice generation model, providing inference, training and deployment full-stack ability.

9,131 (+115)

apache-2.0

2noise/ChatTTS

A generative speech model for daily dialogue.

33,380 (+61)

agpl-3.0

coqui-ai/TTS

🐸💬 - a deep learning toolkit for Text-to-Speech, battle-tested in research and production

36,547 (+59)

mpl-2.0

leon-ai/leon

🧠 Leon is your open-source personal assistant.

15,675 (+51)

mit

myshell-ai/OpenVoice

Instant voice cloning by MIT and MyShell.

30,309 (+44)

mit

jianchang512/pyvideotrans

Translate a video from one language to another and add dubbing. Also supports speech recognition and transcription, speech synthesis, and subtitle translation.

11,394 (+35)

gpl-3.0

rhasspy/piper

A fast, local neural text to speech system

7,238 (+33)

mit

k2-fsa/sherpa-onnx

Speech-to-text, text-to-speech, speaker diarization, and VAD using next-gen Kaldi with onnxruntime without Internet connection. Support embedded systems, Android, iOS, HarmonyOS, Raspberry Pi, RISC-V,...

3,994 (+24)

apache-2.0

myshell-ai/MeloTTS

High-quality multi-lingual text-to-speech library by MyShell.ai. Support English, Spanish, French, Chinese, Japanese and Korean.

5,117 (+22)

mit

rany2/edge-tts

Use Microsoft Edge's online text-to-speech service from Python WITHOUT needing Microsoft Edge or Windows or an API key

6,706 (+18)

lgpl-3.0

IAHispano/Applio

A simple, high-quality voice conversion tool focused on ease of use and performance.

1,940 (+16)

mit

open-mmlab/Amphion

Amphion (/æmˈfaɪən/) is a toolkit for Audio, Music, and Speech Generation. Its purpose is to support reproducible research and help junior researchers and engineers get started in the field of audio, ...

7,999 (+16)

mit

yl4579/StyleTTS2

StyleTTS 2: Towards Human-Level Text-to-Speech through Style Diffusion and Adversarial Training with Large Speech Language Models

5,094 (+13)

mit

espeak-ng/espeak-ng

eSpeak NG is an open source speech synthesizer that supports more than a hundred languages and accents.

4,378 (+13)

gpl-3.0

abus-aikorea/voice-pro

Comprehensive Gradio WebUI for audio processing, powered by Whisper engines (Whisper, Faster-Whisper, Whisper-Timestamped). Features Voice Changer, zero-shot Voice Cloning (E2, F5-TTS), YouTube downlo...

2,459 (+13)

mit

elevenlabs/elevenlabs-python

The official Python API for ElevenLabs Text to Speech.

2,315 (+10)

mit

lenML/Speech-AI-Forge

🍦 Speech-AI-Forge is a project developed around TTS generation model, implementing an API Server and a Gradio-based WebUI.

945 (+9)

agpl-3.0

promptslab/Awesome-Prompt-Engineering

This repository contains hand-curated resources for Prompt Engineering with a focus on Generative Pre-trained Transformer (GPT), ChatGPT, PaLM, etc.

4,047 (+9)

apache-2.0

jaywalnut310/vits

VITS: Conditional Variational Autoencoder with Adversarial Learning for End-to-End Text-to-Speech

7,011 (+8)

mit

Last 3 days (relative gain)

travisvn/obsidian-edge-tts

Free, high quality text-to-speech for your Obsidian notes, leveraging Microsoft Edge's Read Aloud API.

47 (+4%)

gpl-3.0

ashbuilds/payload-ai

AI Plugin is a powerful extension for the Payload CMS, integrating advanced AI capabilities to enhance content creation and management.

98 (+3%)

travisvn/openai-edge-tts

Text-to-speech API endpoint compatible with OpenAI's TTS API endpoint, using Microsoft Edge TTS to generate speech for free locally

210 (+2%)

gpl-3.0

mazzasaverio/youtube-auto-dub

Automated voice dubbing for YouTube videos using Docker, OpenVoice, and FastAPI. Translates and dubs videos with original voice timbre.

42 (+2%)

echogarden-project/echogarden

Cross-platform speech toolset, used from the command-line or as a Node.js library. Includes a variety of engines for speech synthesis, speech recognition, forced alignment, speech translation, voice i...

276 (+2%)

gpl-3.0

lucasnewman/f5-tts-mlx

Implementation of F5-TTS in MLX

411 (+2%)

mit

Wikidepia/indonesian-tts

Indonesian TTS (text-to-speech) using Coqui TTS

60 (+2%)

FunAudioLLM/CosyVoice

Multi-lingual large voice generation model, providing inference, training and deployment full-stack ability.

9,131 (+1%)

apache-2.0

RVC-Boss/GPT-SoVITS

1 min voice data can also be used to train a good TTS model! (few shot voice cloning)

38,127 (+1%)

mit

lenML/Speech-AI-Forge

🍦 Speech-AI-Forge is a project developed around TTS generation model, implementing an API Server and a Gradio-based WebUI.

945 (+1.0%)

agpl-3.0

IAHispano/Applio

A simple, high-quality voice conversion tool focused on ease of use and performance.

1,940 (+0.8%)

mit

YingqingHe/Awesome-LLMs-meet-Multimodal-Generation

🔥🔥🔥 A curated list of papers on LLMs-based multimodal generation (image, video, 3D and audio).

401 (+0.8%)

shivammehta25/Matcha-TTS

[ICASSP 2024] 🍵 Matcha-TTS: A fast TTS architecture with conditional flow matching

811 (+0.6%)

mit

k2-fsa/sherpa-onnx

3,994 (+0.6%)

apache-2.0

abus-aikorea/voice-pro

2,459 (+0.5%)

mit

R3gm/SoniTranslate

Synchronized Translation for Videos. Video dubbing

954 (+0.5%)

apache-2.0

met4citizen/TalkingHead

Talking Head (3D): A JavaScript class for real-time lip-sync using Ready Player Me full-body 3D avatars.

387 (+0.5%)

mit

elevenlabs/elevenlabs-python

The official Python API for ElevenLabs Text to Speech.

2,315 (+0.4%)

mit

myshell-ai/MeloTTS

High-quality multi-lingual text-to-speech library by MyShell.ai. Support English, Spanish, French, Chinese, Japanese and Korean.

5,117 (+0.4%)

mit

DigitalPhonetics/IMS-Toucan

Controllable and fast Text-to-Speech for over 7000 languages!

1,511 (+0.4%)

apache-2.0

Last week (new repositories)

no newly created repositories trending in the last week

Last week (absolute gain)

RVC-Boss/GPT-SoVITS

1 min voice data can also be used to train a good TTS model! (few shot voice cloning)

38,127 (+670)

mit

FunAudioLLM/CosyVoice

Multi-lingual large voice generation model, providing inference, training and deployment full-stack ability.

9,131 (+223)

apache-2.0

2noise/ChatTTS

A generative speech model for daily dialogue.

33,380 (+142)

agpl-3.0

coqui-ai/TTS

🐸💬 - a deep learning toolkit for Text-to-Speech, battle-tested in research and production

36,547 (+137)

mpl-2.0

myshell-ai/OpenVoice

Instant voice cloning by MIT and MyShell.

30,309 (+89)

mit

leon-ai/leon

🧠 Leon is your open-source personal assistant.

15,675 (+82)

mit

jianchang512/pyvideotrans

Translate a video from one language to another and add dubbing. Also supports speech recognition and transcription, speech synthesis, and subtitle translation.

11,394 (+71)

gpl-3.0

rhasspy/piper

A fast, local neural text to speech system

7,238 (+66)

mit

k2-fsa/sherpa-onnx

3,994 (+57)

apache-2.0

rany2/edge-tts

Use Microsoft Edge's online text-to-speech service from Python WITHOUT needing Microsoft Edge or Windows or an API key

6,706 (+56)

lgpl-3.0

abus-aikorea/voice-pro

2,459 (+41)

mit

myshell-ai/MeloTTS

High-quality multi-lingual text-to-speech library by MyShell.ai. Support English, Spanish, French, Chinese, Japanese and Korean.

5,117 (+35)

mit

KoljaB/RealtimeTTS

Converts text to speech in realtime

2,210 (+31)

IAHispano/Applio

A simple, high-quality voice conversion tool focused on ease of use and performance.

1,940 (+27)

mit

open-mmlab/Amphion

7,999 (+25)

mit

yl4579/StyleTTS2

StyleTTS 2: Towards Human-Level Text-to-Speech through Style Diffusion and Adversarial Training with Large Speech Language Models

5,094 (+21)

mit

elevenlabs/elevenlabs-python

The official Python API for ElevenLabs Text to Speech.

2,315 (+21)

mit

echogarden-project/echogarden

276 (+19)

gpl-3.0

espnet/espnet

End-to-End Speech Processing Toolkit

8,646 (+19)

apache-2.0

babysor/MockingBird

🚀 AI voice cloning: clone a voice in 5 seconds to generate arbitrary speech in real-time

35,580 (+18)

Last week (relative gain)

travisvn/obsidian-edge-tts

Free, high quality text-to-speech for your Obsidian notes, leveraging Microsoft Edge's Read Aloud API.

47 (+9%)

gpl-3.0

echogarden-project/echogarden

276 (+7%)

gpl-3.0

Wikidepia/indonesian-tts

Indonesian TTS (text-to-speech) using Coqui TTS

60 (+7%)

ashbuilds/payload-ai

AI Plugin is a powerful extension for the Payload CMS, integrating advanced AI capabilities to enhance content creation and management.

98 (+7%)

Mindinventory/AutoHighlightTTS

AutoHighlightTTS is a simple, powerful solution for Android Text to Speech, featuring auto sentence highlighting with custom styles, auto-scrolling text, and adjustable language, pitch, and speech rat...

35 (+6%)

mit

zhenye234/xcodec

AAAI 2025: Codec Does Matter: Exploring the Semantic Shortcoming of Codec for Audio Language Model

121 (+5%)

travisvn/openai-edge-tts

Text-to-speech API endpoint compatible with OpenAI's TTS API endpoint, using Microsoft Edge TTS to generate speech for free locally

210 (+5%)

gpl-3.0

apinge/MeloTTS.cpp

A lightweight pure C++ Text-to-Speech (TTS) pipeline with OpenVINO, supporting multiple languages.

29 (+4%)

apache-2.0

nicolodiamante/dispatch

Revamp your morning routine and supercharge productivity with Dispatch. The ultimate Apple Shortcut powered by ChatGPT and ElevenLabs.

30 (+3%)

WangHelin1997/SSR-Speech

SSR-Speech: Towards Stable, Safe and Robust Zero-shot Speech Editing and Synthesis

108 (+3%)

mit

gexgd0419/NaturalVoiceSAPIAdapter

Make Azure natural TTS voices accessible to any SAPI 5-compatible application.

218 (+3%)

mit

lucasnewman/f5-tts-mlx

Implementation of F5-TTS in MLX

411 (+3%)

mit

FunAudioLLM/CosyVoice

Multi-lingual large voice generation model, providing inference, training and deployment full-stack ability.

9,131 (+3%)

apache-2.0

mazzasaverio/youtube-auto-dub

Automated voice dubbing for YouTube videos using Docker, OpenVoice, and FastAPI. Translates and dubs videos with original voice timbre.

42 (+2%)

vilassn/whisper_android

Offline Speech Recognition with OpenAI Whisper and TensorFlow Lite for Android

287 (+2%)

mit

RVC-Boss/GPT-SoVITS

1 min voice data can also be used to train a good TTS model! (few shot voice cloning)

38,127 (+2%)

mit

shivammehta25/Matcha-TTS

[ICASSP 2024] 🍵 Matcha-TTS: A fast TTS architecture with conditional flow matching

811 (+2%)

mit

abus-aikorea/voice-pro

2,459 (+2%)

mit

lenML/Speech-AI-Forge

🍦 Speech-AI-Forge is a project developed around TTS generation model, implementing an API Server and a Gradio-based WebUI.

945 (+2%)

agpl-3.0

YingqingHe/Awesome-LLMs-meet-Multimodal-Generation

🔥🔥🔥 A curated list of papers on LLMs-based multimodal generation (image, video, 3D and audio).

401 (+2%)

Last month (new repositories)

Mindinventory/AutoHighlightTTS

mit

Last month (absolute gain)

FunAudioLLM/CosyVoice

Multi-lingual large voice generation model, providing inference, training and deployment full-stack ability.

9,131 (+2,474)

apache-2.0

RVC-Boss/GPT-SoVITS

1 min voice data can also be used to train a good TTS model! (few shot voice cloning)

38,127 (+1,583)

mit

coqui-ai/TTS

🐸💬 - a deep learning toolkit for Text-to-Speech, battle-tested in research and production

36,547 (+646)

mpl-2.0

2noise/ChatTTS

A generative speech model for daily dialogue.

33,380 (+591)

agpl-3.0

jianchang512/pyvideotrans

Translate a video from one language to another and add dubbing. Also supports speech recognition and transcription, speech synthesis, and subtitle translation.

11,394 (+429)

gpl-3.0

rhasspy/piper

A fast, local neural text to speech system

7,238 (+428)

mit

myshell-ai/OpenVoice

Instant voice cloning by MIT and MyShell.

30,309 (+327)

mit

abus-aikorea/voice-pro

2,459 (+274)

mit

k2-fsa/sherpa-onnx

3,994 (+223)

apache-2.0

rany2/edge-tts

Use Microsoft Edge's online text-to-speech service from Python WITHOUT needing Microsoft Edge or Windows or an API key

6,706 (+214)

lgpl-3.0

leon-ai/leon

🧠 Leon is your open-source personal assistant.

15,675 (+155)

mit

myshell-ai/MeloTTS

High-quality multi-lingual text-to-speech library by MyShell.ai. Support English, Spanish, French, Chinese, Japanese and Korean.

5,117 (+149)

mit

open-mmlab/Amphion

7,999 (+141)

mit

babysor/MockingBird

🚀 AI voice cloning: clone a voice in 5 seconds to generate arbitrary speech in real-time

35,580 (+133)

KoljaB/RealtimeTTS

Converts text to speech in realtime

2,210 (+122)

collabora/WhisperLive

A nearly-live implementation of OpenAI's Whisper.

2,248 (+111)

mit

espeak-ng/espeak-ng

eSpeak NG is an open source speech synthesizer that supports more than a hundred languages and accents.

4,378 (+104)

gpl-3.0

IAHispano/Applio

A simple, high-quality voice conversion tool focused on ease of use and performance.

1,940 (+98)

mit

jishengpeng/WavTokenizer

SOTA discrete acoustic codec models with 40 tokens per second for audio language modeling

932 (+96)

mit

promptslab/Awesome-Prompt-Engineering

This repository contains hand-curated resources for Prompt Engineering with a focus on Generative Pre-trained Transformer (GPT), ChatGPT, PaLM, etc.

4,047 (+87)

apache-2.0

Last month (relative gain)

travisvn/obsidian-edge-tts

Free, high quality text-to-speech for your Obsidian notes, leveraging Microsoft Edge's Read Aloud API.

47 (+68%)

gpl-3.0

Mindinventory/AutoHighlightTTS

35 (+40%)

mit

FunAudioLLM/CosyVoice

Multi-lingual large voice generation model, providing inference, training and deployment full-stack ability.

9,131 (+37%)

apache-2.0

echogarden-project/echogarden

276 (+33%)

gpl-3.0

travisvn/openai-edge-tts

Text-to-speech API endpoint compatible with OpenAI's TTS API endpoint, using Microsoft Edge TTS to generate speech for free locally

210 (+27%)

gpl-3.0

apinge/MeloTTS.cpp

A lightweight pure C++ Text-to-Speech (TTS) pipeline with OpenVINO, supporting multiple languages.

29 (+26%)

apache-2.0

iMicknl/azure-podcast-generator

Generate an engaging podcast based on your document using Azure OpenAI and Azure Speech.

26 (+24%)

mit

ashbuilds/payload-ai

AI Plugin is a powerful extension for the Payload CMS, integrating advanced AI capabilities to enhance content creation and management.

98 (+20%)

gexgd0419/NaturalVoiceSAPIAdapter

Make Azure natural TTS voices accessible to any SAPI 5-compatible application.

218 (+16%)

mit

elevenlabs/elevenlabs-examples

No description

220 (+15%)

mit

abus-aikorea/voice-pro

2,459 (+13%)

mit

zhenye234/xcodec

AAAI 2025: Codec Does Matter: Exploring the Semantic Shortcoming of Codec for Audio Language Model

121 (+12%)

Aivis-Project/AivisSpeech-Engine

AivisSpeech Engine: AI Voice Imitation System - Text to Speech Engine

76 (+12%)

lgpl-3.0

jishengpeng/WavTokenizer

SOTA discrete acoustic codec models with 40 tokens per second for audio language modeling

932 (+11%)

mit

azkadev/piper

Wip From NTTS Piper Ultra Fast And Efficient Text To Speech Library For Cross Platform Work on any edge device with cpu only

110 (+11%)

Aivis-Project/AivisSpeech

AivisSpeech: AI Voice Imitation System - Text to Speech Software

297 (+11%)

lgpl-3.0

pnkvalavala/digitaltwin

Using a single image and just 10 seconds of sample audio, our project enables you to create a video where it appears as if you're speaking the desired text.

31 (+11%)

Wikidepia/indonesian-tts

Indonesian TTS (text-to-speech) using Coqui TTS

60 (+9%)

edwko/OuteTTS

Interface for OuteTTS models.

792 (+9%)

apache-2.0

e-c-k-e-r/vall-e

An unofficial PyTorch implementation of VALL-E

87 (+9%)

agpl-3.0

Last 12 months (new repositories)

RVC-Boss/GPT-SoVITS

1 min voice data can also be used to train a good TTS model! (few shot voice cloning)

38,127

mit

2noise/ChatTTS

A generative speech model for daily dialogue.

33,380

agpl-3.0

FunAudioLLM/CosyVoice

Multi-lingual large voice generation model, providing inference, training and deployment full-stack ability.

9,131

apache-2.0

myshell-ai/MeloTTS

High-quality multi-lingual text-to-speech library by MyShell.ai. Support English, Spanish, French, Chinese, Japanese and Korean.

5,117

mit

metavoiceio/metavoice-src

Foundational model for human-like, expressive TTS

3,958

apache-2.0

Camb-ai/MARS5-TTS

MARS5 speech model (TTS) from CAMB.AI

2,581

agpl-3.0

abus-aikorea/voice-pro

2,459

mit

6drf21e/ChatTTS_colab

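🚀 One-click deployment (offline bundle included)! Based on ChatTTS, with support for streaming output, random voice timbre draws, long-audio generation, and multi-role reading. Easy to use, with no complicated installation.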

2,192

ictnlp/StreamSpeech

StreamSpeech is an “All in One” seamless model for offline and simultaneous speech recognition, speech translation and speech synthesis.

985

mit

lenML/Speech-AI-Forge

🍦 Speech-AI-Forge is a project developed around TTS generation model, implementing an API Server and a Gradio-based WebUI.

945

agpl-3.0

jishengpeng/WavTokenizer

SOTA discrete acoustic codec models with 40 tokens per second for audio language modeling

932

mit

edwko/OuteTTS

Interface for OuteTTS models.

792

apache-2.0

mezbaul-h/june

Local voice chatbot for engaging conversations, powered by Ollama, Hugging Face Transformers, and Coqui TTS Toolkit

729

mit

FireRedTeam/FireRedTTS

An Open-Sourced LLM-empowered Foundation TTS System

512

mpl-2.0

lucasnewman/f5-tts-mlx

Implementation of F5-TTS in MLX

411

mit

lucidrains/e2-tts-pytorch

Implementation of E2-TTS, "Embarrassingly Easy Fully Non-Autoregressive Zero-Shot TTS", in Pytorch

399

mit

lukaszliniewicz/Pandrator

Turn PDFs and EPUBs into audiobooks, subtitles or videos into dubbed videos (including translation), and more. For free. Pandrator uses local models, notably XTTS, including voice-cloning (instant, RV...

373

agpl-3.0

thinhlpg/vixtts-demo

A Vietnamese Voice Cloning Text-to-Speech Model ✨

354

mpl-2.0

Aivis-Project/AivisSpeech

AivisSpeech: AI Voice Imitation System - Text to Speech Software

297

lgpl-3.0

frostming/tetos

A unified interface for multiple Text-to-Speech (TTS) providers.

251

Last 12 months (absolute gain)

RVC-Boss/GPT-SoVITS

1 min voice data can also be used to train a good TTS model! (few shot voice cloning)

38,127 (+38,124)

mit

2noise/ChatTTS

A generative speech model for daily dialogue.

33,380 (+33,356)

agpl-3.0

myshell-ai/OpenVoice

Instant voice cloning by MIT and MyShell.

30,309 (+21,314)

mit

coqui-ai/TTS

🐸💬 - a deep learning toolkit for Text-to-Speech, battle-tested in research and production

36,547 (+11,292)

mpl-2.0

FunAudioLLM/CosyVoice

Multi-lingual large voice generation model, providing inference, training and deployment full-stack ability.

9,131 (+9,129)

apache-2.0

jianchang512/pyvideotrans

Translate a video from one language to another and add dubbing. Also supports speech recognition and transcription, speech synthesis, and subtitle translation.

11,394 (+8,710)

gpl-3.0

myshell-ai/MeloTTS

High-quality multi-lingual text-to-speech library by MyShell.ai. Support English, Spanish, French, Chinese, Japanese and Korean.

5,117 (+5,114)

mit

open-mmlab/Amphion

7,999 (+5,059)

mit

rhasspy/piper

A fast, local neural text to speech system

7,238 (+4,928)

mit

rany2/edge-tts

Use Microsoft Edge's online text-to-speech service from Python WITHOUT needing Microsoft Edge or Windows or an API key

6,706 (+4,155)

lgpl-3.0

metavoiceio/metavoice-src

Foundational model for human-like, expressive TTS

3,958 (+3,868)

apache-2.0

k2-fsa/sherpa-onnx

3,994 (+3,571)

apache-2.0

babysor/MockingBird

🚀 AI voice cloning: clone a voice in 5 seconds to generate arbitrary speech in real-time

35,580 (+2,889)

Camb-ai/MARS5-TTS

MARS5 speech model (TTS) from CAMB.AI

2,581 (+2,560)

agpl-3.0

abus-aikorea/voice-pro

2,459 (+2,458)

mit

6drf21e/ChatTTS_colab

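🚀 One-click deployment (offline bundle included)! Based on ChatTTS, with support for streaming output, random voice timbre draws, long-audio generation, and multi-role reading. Easy to use, with no complicated installation.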

2,192 (+2,181)

netease-youdao/EmotiVoice

EmotiVoice 😊: a Multi-Voice and Prompt-Controlled TTS Engine

7,571 (+2,057)

apache-2.0

collabora/WhisperLive

A nearly-live implementation of OpenAI's Whisper.

2,248 (+1,874)

mit

espeak-ng/espeak-ng

eSpeak NG is an open source speech synthesizer that supports more than a hundred languages and accents.

4,378 (+1,791)

gpl-3.0

yl4579/StyleTTS2

StyleTTS 2: Towards Human-Level Text-to-Speech through Style Diffusion and Adversarial Training with Large Speech Language Models

5,094 (+1,750)

mit

Last 12 months (relative gain)

2noise/ChatTTS

A generative speech model for daily dialogue.

33,380 (+138,983%)

agpl-3.0

6drf21e/ChatTTS_colab

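🚀 One-click deployment (offline bundle included)! Based on ChatTTS, with support for streaming output, random voice timbre draws, long-audio generation, and multi-role reading. Easy to use, with no complicated installation.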

2,192 (+19,827%)

ictnlp/StreamSpeech

StreamSpeech is an “All in One” seamless model for offline and simultaneous speech recognition, speech translation and speech synthesis.

985 (+16,317%)

mit

lenML/Speech-AI-Forge

🍦 Speech-AI-Forge is a project developed around TTS generation model, implementing an API Server and a Gradio-based WebUI.

945 (+13,400%)

agpl-3.0

Camb-ai/MARS5-TTS

MARS5 speech model (TTS) from CAMB.AI

2,581 (+12,190%)

agpl-3.0

YingqingHe/Awesome-LLMs-meet-Multimodal-Generation

🔥🔥🔥 A curated list of papers on LLMs-based multimodal generation (image, video, 3D and audio).

401 (+6,583%)

jishengpeng/WavTokenizer

SOTA discrete acoustic codec models with 40 tokens per second for audio language modeling

932 (+4,805%)

mit

metavoiceio/metavoice-src

Foundational model for human-like, expressive TTS

3,958 (+4,298%)

apache-2.0

FireRedTeam/FireRedTTS

An Open-Sourced LLM-empowered Foundation TTS System

512 (+3,313%)

mpl-2.0

lucidrains/e2-tts-pytorch

Implementation of E2-TTS, "Embarrassingly Easy Fully Non-Autoregressive Zero-Shot TTS", in Pytorch

399 (+2,750%)

mit

R3gm/SoniTranslate

Synchronized Translation for Videos. Video dubbing

954 (+1,847%)

apache-2.0

met4citizen/TalkingHead

Talking Head (3D): A JavaScript class for real-time lip-sync using Ready Player Me full-body 3D avatars.

387 (+1,835%)

mit

ElmTran/praises

Praises is a text-to-speech tool that can help you read text easily.

235 (+1,708%)

mit

lucasnewman/f5-tts-mlx

Implementation of F5-TTS in MLX

411 (+1,368%)

mit

sidharthrajaram/StyleTTS2

🐍 🤖 Pip installable package for StyleTTS 2 human-level text-to-speech and voice cloning

145 (+1,350%)

jofizcd/Soul-of-Waifu

If you've ever had the wish to talk to your AI Waifu using quality characters and voices for character voicing, then I suggest Soul of Waifu. Don't miss the opportunity to touch your dream!

57 (+1,325%)

mit