Trending repositories for topic audio-processing
Cross-platform, customizable ML solutions for live and streaming media.
A C++ based, lightweight music and noise remover for YouTube and other internet media, using DeepFilterNet for audio enhancement.
An implementation of the system-wide JamesDSP audio processing engine for non-rooted Android devices
FoleyCrafter: Bring Silent Videos to Life with Lifelike and Synchronized Sounds. An AI Foley master that adds vivid, synchronized sound effects to your silent videos 😝
A user-friendly library for reproducible video moment retrieval and highlight detection. It also supports audio moment retrieval.
A Teensy 3.x/4.x based polyphonic synthesizer, modelled after the Juno-106
Code and Pretrained Models for Interspeech 2023 Paper "Whisper-AT: Noise-Robust Automatic Speech Recognizers are Also Strong Audio Event Taggers"
⏩ Fast-forwards long pauses between sentences — watch lectures ~1.5x faster (browser extension)
PESQ (Perceptual Evaluation of Speech Quality) Wrapper for Python Users (narrow band and wide band)
A library for audio and music analysis and feature extraction.
🎵 [Android Library] A light-weight and easy-to-use Audio Visualizer for Android.
Digital Audio Workstation with Python; VST instruments/effects, parameter automation, FAUST, JAX, Warp Markers, and JUCE processors
PipeWire Guide. Learn about how PipeWire gives your Linux system a Professional Audio/Video Processing workflow.
Data manipulation and transformation for audio signal processing, powered by PyTorch
Estimating the age, height, and gender of a speaker from their speech signal. https://arxiv.org/pdf/2110.13653.pdf
An architecture for neural network inference in real-time audio applications
Digital Multi-Effect Pedal with Reverb, Delay, Tremolo, Looper, and Neural Networks for Amp Modeling
Speaker change detection using SincNet and an LSTM/Transformer
Fast audio player, recorder, converter for Windows, Linux & Android
Easily train a good voice conversion (VC) model with 10 minutes or less of voice data!
Coursera Course: Introduction to Programming 👩‍💻 with MATLAB, by Vanderbilt University 🎓
STFT based real-time pitch and timbre shifting in C++ and Python
Android Voice Activity Detection (VAD) library. Supports WebRTC VAD GMM, Silero VAD DNN, Yamnet VAD DNN models.
A GPU-accelerated library containing highly optimized building blocks and an execution engine for data processing to accelerate deep learning training and inference applications.
StreamSpeech is an “All in One” seamless model for offline and simultaneous speech recognition, speech translation and speech synthesis.
Collection of resources on the applications of Large Language Models (LLMs) in Audio AI.
Isolate vocals, drums, bass, and other instrumental stems from any song
A Web and Native UI for ffmpeg-wasm: convert video, audio and images using the power of ffmpeg, directly from your web browser or from your computer.
A curated list of awesome AI tools for music composition, generation, enhancement, and more.
Apply Score diffusion to improve speech signals recorded under various adverse conditions and distortions, including noise, reverberation, clipping, equalization (EQ) distortion, packet loss, codec lo...
This is the official repository of the papers "Parameter-Efficient Transfer Learning of Audio Spectrogram Transformers" and "Efficient Fine-tuning of Audio Spectrogram Transformers via Soft Mixture of...
A high-performance, "quantum-inspired" Fast Fourier Transform (FFT) library written in pure and safe Rust.
🔊 Create labeled datasets, enhance audio quality, identify speakers, support diverse dataset types. 🎧👥📊 Advanced audio processing.
WhisperClip simplifies your life by automatically transcribing audio recordings and saving the text directly to your clipboard. With just a click of a button, you can effortlessly convert spoken words...
Audio time stretch and pitch shift library. Enables music tempo adjustment, transposition, "smooth scrub" and "live pause".
Official code for SEE-2-SOUND: Zero-Shot Spatial Environment-to-Spatial Sound
A collection of amazing tools for working with audio and sound files in ComfyUI
Wunjo CE: Face Swap, Lip Sync, Control Remove Objects & Text & Background, Restyling, Audio Separator, Clone Voice, Video Generation. Open Source, Local & Free.
The collection of pre-trained, state-of-the-art AI models for ailia SDK
A music theory library in Rust for generating songs 🎶
This free tool transforms your books, textbooks, or any text document into fantastic sounding audiobooks using OpenAI's state-of-the-art TTS technology.
Easy audio transcription tool with flexible processing and post-processing options
Flutter low-level audio plugin using SoLoud C++ library and FFI
Subtitle to audio: generate audio from any subtitle file using Coqui-ai TTS and synchronize the audio timing with the subtitle timestamps.