Trending repositories for topic audio-processing
Cross-platform, customizable ML solutions for live and streaming media.
The collection of pre-trained, state-of-the-art AI models for ailia SDK
A C++ based, lightweight music and noise remover for YouTube and other internet media, using DeepFilterNet for audio enhancement.
Code and Pretrained Models for ICLR 2023 Paper "Contrastive Audio-Visual Masked Autoencoder".
StreamSpeech is an “All in One” seamless model for offline and simultaneous speech recognition, speech translation and speech synthesis.
Open-Source Large Vocabulary Continuous Speech Recognition Engine
🎵 🌈 Real-time LED strip music visualization using Python and the ESP8266 or Raspberry Pi
Digital Audio Workstation with Python; VST instruments/effects, parameter automation, FAUST, JAX, Warp Markers, and JUCE processors
Checkrr scans your library files for corrupt media and optionally replaces the files via Sonarr and Radarr
Make VST2 / VST3 / AU / AAX / CLAP / LV2 / FLP plug-ins for Linux/macOS/Windows, using D.
Flutter low-level audio plugin using SoLoud C++ library and FFI
Unofficial implementation of High Fidelity Neural Audio Compression
Speech, Language, Audio, Music Processing with Large Language Model
FoleyCrafter: Bring Silent Videos to Life with Lifelike and Synchronized Sounds. An AI Foley artist that adds vivid, synchronized sound effects to your silent videos 😝
This is a list of sound, audio and music development tools which contains machine learning, audio generation, audio signal processing, sound synthesis, spatial audio, music information retrieval, musi...
⏩ Fast-forwards long pauses between sentences — watch lectures ~1.5x faster (browser extension)
A GPU-accelerated library containing highly optimized building blocks and an execution engine for data processing to accelerate deep learning training and inference applications.
LedFx is a network based LED effect engine designed to deliver advanced real-time audio effects to a wide variety of devices.
PipeWire Guide. Learn about how PipeWire gives your Linux system a Professional Audio/Video Processing workflow.
Example effects code and binaries for the Cleveland Music Co. Hothouse Digital Signal Processing Pedal Kit
An architecture for neural network inference in real-time audio applications
A simple Python wrapper for audio noise reduction RNNoise. Simplifies work with it, adds new trained models and detailed instructions for training.
🎵 🎹 Firmware boilerplate for the RP2040 / RP2350 powered PicoADK Audio Development Boards. Build your own stand alone synthesizers! Includes all nuts and bolts (FreeRTOS, USB MIDI, Vult DSP, Hardwar...
A Teensy 3.x/4.x based polyphonic synthesizer, modelled after the Juno-106
Isolate vocals, drums, bass, and other instrumental stems from any song
An implementation of the system-wide JamesDSP audio processing engine for non-rooted Android devices
ConsistencyTTA: Accelerating Diffusion-Based Text-to-Audio Generation with Consistency Distillation
Apply score diffusion to improve speech signals recorded under various adverse conditions and distortions, including noise, reverberation, clipping, equalization (EQ) distortion, packet loss, codec lo...
A Web and Native UI for ffmpeg-wasm: convert video, audio and images using the power of ffmpeg, directly from your web browser or from your computer.
[EMNLP2024 Demo] A user-friendly library for reproducible video moment retrieval and highlight detection. It also supports audio moment retrieval.
WhisperClip simplifies your life by automatically transcribing audio recordings and saving the text directly to your clipboard. With just a click of a button, you can effortlessly convert spoken words...
Single- and Multi-Speaker Cloned Voice Detection: From Perceptual to Learned Features
Open-source Python program for automating gain staging. Part 1 of a series on automating audio processing tasks; the end goal is to create a full set of tools for an AI to use for automating audio proces...
🔊 Create labeled datasets, enhance audio quality, identify speakers, support diverse dataset types. 🎧👥📊 Advanced audio processing.
A high-performance, "quantum-inspired" Fast Fourier Transform (FFT) library written in pure and safe Rust.
Easily train a good VC model with voice data <= 10 mins!
Official code for SEE-2-SOUND: Zero-Shot Spatial Environment-to-Spatial Sound
Audio time stretch and pitch shift library. Enables music tempo adjustment, transposition, "smooth scrub" and "live pause".
A collection of audio tools for working with audio and sound files in ComfyUI
A library for audio and music analysis and feature extraction.
Collection of resources on the applications of Large Language Models (LLMs) in Audio AI.
A music theory library in Rust for generating songs🎶
Fast audio player, recorder, converter for Windows, Linux & Android
This free tool transforms your books, textbooks, or any text document into fantastic sounding audiobooks using OpenAI's state-of-the-art TTS technology.
This is the official repository of the papers "Parameter-Efficient Transfer Learning of Audio Spectrogram Transformers" and "Efficient Fine-tuning of Audio Spectrogram Transformers via Soft Mixture of...
Subtitle to audio: generate audio from any subtitle file using Coqui-ai TTS and synchronize the audio with the subtitle timings.