Search Results - RepositoryStats

2.1k

27.5k

mit

193

🤖 The free, Open Source alternative to OpenAI, Claude and others. Self-hosted and local-first. Drop-in replacement for OpenAI, running on consumer-grade hardware. No GPU required. Runs gguf, tr...

Created 2023-03-18

3,300 commits to master branch, last one 7 hours ago

CosyVoice FunAudioLLM

882

9.1k

apache-2.0

82

Multi-lingual large voice generation model, providing full-stack inference, training, and deployment capabilities.

tts gpt-4o korean python chatbot chatgpt chinese english japanese cantonese cosyvoice fine-tuning fine-grained cross-lingual multi-lingual voice-cloning text-to-speech audio-generation natural-language-generation

Created 2024-07-03

267 commits to main branch, last one 23 hours ago

Amphion open-mmlab

605

8.0k

mit

76

Amphion (/æmˈfaɪən/) is a toolkit for Audio, Music, and Speech Generation. Its purpose is to support reproducible research and help junior researchers and engineers get started in the field of audio, ...

vits audit emilia vall-e maskgct vocoder audioldm fastspeech2 text-to-audio naturalspeech2 text-to-speech audio-synthesis audio-generation music-generation speech-synthesis voice-conversion singing-voice-conversion

Created 2023-11-15

114 commits to main branch, last one 2 days ago

AudioLDM haoheliu

227

2.5k

other

44

AudioLDM: Generate speech, sound effects, music and beyond, with text.

audio-generation

Created 2023-01-29

107 commits to main branch, last one 26 days ago

AudioLDM2 haoheliu

184

2.3k

other

45

Text-to-Audio/Music Generation

audio-generation

Created 2023-08-04

86 commits to main branch, last one 3 months ago

audio-diffusion-pytorch archinetai

168

2.0k

mit

40

Audio generation using diffusion models, in PyTorch.

deep-learning audio-generation denoising-diffusion artificial-intelligence

Created 2022-07-07

188 commits to main branch, last one about a year ago

tts-generation-webui rsxdalv

208

1.9k

mit

35

TTS Generation Web UI (Bark, MusicGen + AudioGen, Tortoise, RVC, Vocos, Demucs, SeamlessM4T, MAGNet, StyleTTS2, MMS, Stable Audio, Mars5, F5-TTS, ParlerTTS)

Created 2023-04-27

239 commits to main branch, last one 18 days ago

audio-ai-timeline archinetai

70

1.9k

unknown

168

A timeline of the latest AI models for audio generation, starting in 2023!

audio-generation machine-learning artificial-intelligence

Created 2023-01-29

59 commits to main branch, last one about a year ago

soundstorm-pytorch lucidrains

91

1.5k

mit

51

Implementation of SoundStorm, Efficient Parallel Audio Generation from Google DeepMind, in PyTorch

transformers deep-learning audio-generation non-autoregressive attention-mechanism artificial-intelligence

Created 2023-05-17

94 commits to main branch, last one 2 months ago

tango declare-lab

94

1.1k

other

28

A family of diffusion models for text-to-audio generation.

diffusion text-to-audio language-models audio-generation diffusion-models large-language-models

Created 2023-04-10

133 commits to master branch, last one 5 days ago

BigVGAN NVIDIA

111

929

mit

71

Official PyTorch implementation of BigVGAN (ICLR 2023)

neural-vocoder audio-synthesis music-synthesis audio-generation speech-synthesis singing-voice-synthesis

Created 2022-06-07

47 commits to main branch, last one 4 months ago

ai-audio-datasets Yuan-ManX

40

575

mit

13

AI Audio Datasets (AI-ADS) 🎵, including Speech, Music, and Sound Effects, which can provide training data for Generative AI, AIGC, AI model training, intelligent audio tool development, and audio app...

aigc audio datasets audio-effect deep-learning audio-generation machine-learning music-generation artificial-intelligence

Created 2022-12-18

183 commits to main branch, last one about a month ago

MM-Diffusion researchmm

22

404

mit

6

[CVPR'23] MM-Diffusion: Learning Multi-Modal Diffusion Models for Joint Audio and Video Generation

multi-modality audio-generation content-creation diffusion-models video-generation

Created 2022-12-11

16 commits to main branch, last one 7 months ago

FunCodec modelscope

31

378

mit

16

FunCodec is a research-oriented toolkit for audio quantization and downstream applications, such as text-to-speech synthesis and music generation.

tts codec encodec voicecloning speech-to-text audio-generation speech-synthesis audio-quantization

Created 2023-10-07

71 commits to master branch, last one about a year ago

awesome-audio-plaza metame-ai

13

366

mit

35

Daily tracking of awesome audio papers, including music generation, zero-shot tts, asr, audio generation

asr tts awesome zero-shot-tts audio-generation music-generation awesome-music-generation

Created 2024-02-03

138 commits to main branch, last one a day ago

SpecVQGAN v-iashin

39

354

mit

8

Source code for "Taming Visually Guided Sound Generation" (Oral at the BMVC 2021)

gan vas bmvc audio video vqvae melgan pytorch vggsound multi-modal transformer video-features audio-generation evaluation-metrics video-understanding

Created 2021-10-17

24 commits to main branch, last one 5 months ago

audio-development-tools Yuan-ManX

21

320

mit

12

This is a list of sound, audio and music development tools which contains machine learning, audio generation, audio signal processing, sound synthesis, spatial audio, music information retrieval, musi...

dsp audio music speech deep-learning audio-generation audio-processing machine-learning music-generation speech-synthesis signal-processing speech-processing artificial-intelligence

Created 2022-09-14

526 commits to main branch, last one 3 months ago

InspireMusic FunAudioLLM

25

285

apache-2.0

14

InspireMusic: A Unified Framework for Music, Song, Audio Generation.

pytorch audio-generation audio-processing music-generation

Created 2024-10-29

88 commits to main branch, last one 8 days ago

modular-diffusion cabralpinto

12

267

mit

8

Python library for designing and training your own Diffusion Models with PyTorch.

u-net python pytorch transformer deep-learning modular-design text-generation audio-generation diffusion-models image-generation machine-learning

Created 2023-06-22

47 commits to main branch, last one 5 months ago

bigvsan sony

16

201

mit

29

PyTorch implementation of BigVSAN

gan pytorch neural-vocoder audio-synthesis audio-generation speech-synthesis

Created 2023-09-01

16 commits to main branch, last one 9 months ago

Catch-A-Waveform galgreshler

35

189

other

4

Official PyTorch implementation of the paper: "Catch-A-Waveform: Learning to Generate Audio from a Single Short Example" (NeurIPS 2021)

gan raw-waveforms single-example audio-denoising audio-generation audio-inpainting music-generation speech-synthesis bandwidth-extension audio-super-resolution

Created 2021-05-23

41 commits to main branch, last one 9 months ago

awesome-sound_event_detection soham97

9

168

mit

8

Reading list for research topics in Sound AI

icassp interspeech audio-retrieval audio-captioning audio-generation audio-processing zero-shot-learning sound-event-detection representation-learning acoustic-scene-classification

Created 2020-11-28

62 commits to main branch, last one 4 months ago

Auffusion happylittlecat2333

13

166

other

9

Official code and models for the paper "Auffusion: Leveraging the Power of Diffusion and Large Language Models for Text-to-Audio Generation"

diffusion text-to-audio audio-generation diffusion-models large-language-models

Created 2023-11-15

18 commits to main branch, last one 9 months ago

audio-data-pytorch archinetai

22

137

mit

5

A collection of useful audio datasets and transforms for PyTorch.

pytorch datasets deep-learning audio-generation artifical-intelligense

Created 2022-07-24

30 commits to main branch, last one about a year ago

audio-diffusion-pytorch-trainer archinetai

23

128

mit

6

Trainer for audio-diffusion-pytorch

deep-learning audio-generation denoising-diffusion artificial-intelligence

Created 2022-08-19

340 commits to main branch, last one about a year ago

word2wave ilaria-manco

15

119

mit

3

Word2Wave: a framework for generating short audio samples from a text prompt using WaveGAN and COALA.

ai-music text-to-audio audio-generation music-generation

Created 2021-04-20

80 commits to main branch, last one 3 years ago

neuralnoise leopiney

10

119

mit

4

The AI Podcast Studio: generate podcast scripts and their audio versions with a team of AI workers in a Podcast Studio 🎙️📜

ai llms openai autogen podcast elevenlabs notebooklm audio-generation podcast-generator

Created 2024-10-11

35 commits to main branch, last one 9 days ago

im2wav RoySheffer

10

110

mit

3

Official implementation of the pipeline presented in "I Hear Your True Colors: Image Guided Audio Generation"

audio pytorch image-to-audio video-to-audio audio-generation machine-learning

Created 2022-10-29

3 commits to main branch, last one about a year ago

soundctm sony

6

75

mit

3

PyTorch implementation of SoundCTM

pytorch text-to-audio audio-generation diffusion-models

Created 2024-06-04

23 commits to main branch, last one about a month ago

bark-speaker-directory rsxdalv

0

49

mit

4

Site for sharing Bark voices

ai tts web bark deep-learning text-to-speech audio-generation machine-learning

Created 2023-05-05

37 commits to main branch, last one 6 months ago