Statistics for topic text-to-speech
RepositoryStats tracks 518,325 GitHub repositories; of these, 344 are tagged with the text-to-speech topic. The most common primary language for repositories using this topic is Python (186). Other languages include JavaScript (25), Jupyter Notebook (24), TypeScript (18), C++ (15), C# (12), and Java (11).
Stargazers over time for topic text-to-speech
Most starred repositories for topic text-to-speech
Trending repositories for topic text-to-speech
As little as 1 minute of voice data can be used to train a good TTS model (few-shot voice cloning).
🐸💬 - a deep learning toolkit for Text-to-Speech, battle-tested in research and production
Translate a video from one language to another and add dubbing.
A web UI project for learning about large language models. It includes features such as chat, quantization, fine-tuning, prompt engineering templates, and multimodality.
Pandrator aspires to be a user-friendly app with a graphical interface and a one-click installer that creates high-quality speech from text in multiple languages (audiobooks, speech synchronised with ...
A desktop application that uses AI to translate voice between languages in real time, while preserving the speaker's tone and emotion.
Simple Python script to interact with the TikTok TTS Voices.
A powerful multilingual (97 languages) tool for TTS that automatically recognizes and segments mixed-language text content.
Easy-to-use speech toolset. Written in TypeScript. Includes tools for synthesis, recognition, alignment, speech translation, language detection, source separation and more.
🔥🔥🔥 A curated list of papers on LLM-based multimodal generation (image, video, 3D and audio).
eSpeak NG is an open source speech synthesizer that supports more than a hundred languages and accents.
An open source implementation of Microsoft's VALL-E X zero-shot TTS model. A demo is available at https://plachtaa.github.io
EmotiVoice 😊: a Multi-Voice and Prompt-Controlled TTS Engine
Amphion (/æmˈfaɪən/) is a toolkit for Audio, Music, and Speech Generation. Its purpose is to support reproducible research and help junior researchers and engineers get started in the field of audio, ...
StyleTTS 2: Towards Human-Level Text-to-Speech through Style Diffusion and Adversarial Training with Large Speech Language Models