Trending repositories for topic voice-conversion
🐸💬 - a deep learning toolkit for Text-to-Speech, battle-tested in research and production
Easily train a good VC model with voice data <= 10 mins!
Amphion (/æmˈfaɪən/) is a toolkit for Audio, Music, and Speech Generation. Its purpose is to support reproducible research and help junior researchers and engineers get started in the field of audio, ...
State-of-the-art zero-shot voice conversion & singing voice conversion with in-context learning
A simple, high-quality voice conversion tool focused on ease of use and performance
🔊 A comprehensive list of open-source datasets for voice and sound computing (95+ datasets).
so-vits-svc fork with realtime support, improved interface and more features.
Automatic Speech Recognition (ASR), Speaker Verification, Speech Synthesis, Text-to-Speech (TTS), Language Modelling, Singing Voice Synthesis (SVS), Voice Conversion (VC)
StarGANv2-VC: A Diverse, Unsupervised, Non-parallel Framework for Natural-Sounding Voice Conversion
YourTTS: Towards Zero-Shot Multi-Speaker TTS and Zero-Shot Voice Conversion for everyone
Singing voice conversion based on Whisper, with LoRA for singing voice cloning
A locally deployable AI voice toolbox | A user-friendly audio toolkit for voice recognition, voice transcription, voice conversion, etc.
VITS2: Improving Quality and Efficiency of Single-Stage Text-to-Speech with Adversarial Learning and Architecture Design
HuBERT content encoders for: A Comparison of Discrete and Soft Speech Units for Improved Voice Conversion
The code for the bark-voicecloning model. Training and inference.
NISQA - Non-Intrusive Speech Quality and TTS Naturalness Assessment
AutoVC: Zero-Shot Voice Style Transfer with Only Autoencoder Loss
🤖 + 🐳 + 🐧 Monadic Chat is a locally hosted web app for creating intelligent chatbots, available for Mac, Windows, and Linux. It offers a Linux environment on Docker for GPT and other LLMs, enabling...
🚀 RVC + UVR = A perfect set of tools for voice cloning, easy and free!
An implementation of Charactr, Inc's "WavThruVec: Latent speech representation as intermediate features for neural speech synthesis"
ChatGPT Voice Chatbot Telegram is a Python and Flask-based GitHub repository that enables users to communicate with an AI chatbot using voice-to-text and text-to-voice technologies powered by OpenAI. ...
Retrieval-based Voice Conversion (RVC) implemented with Hugging Face Transformers.
Advanced RVC Inference for quicker and effortless model downloads
RVC inference with multi-model and Hugging Face support
Joint CTC-S2S Phoneme-level ASR for Voice Conversion and TTS (Text-Mel Alignment)
Uses a jointly trained speaker encoder with a consistency loss to achieve cross-lingual and expressive voice conversion
TriAAN-VC: Triple Adaptive Attention Normalization for Any-to-Any Voice Conversion