119 results found

Neural building blocks for speaker diarization: speech activity detection, speaker change detection, overlapped speech detection, speaker embedding
Created 2016-03-07
2,400 commits to main branch, last one 6 months ago
Reading list for research topics in multimodal machine learning
Created 2019-05-27
435 commits to master branch, last one 9 months ago
526
5.4k
mit
54
Silero VAD: pre-trained enterprise-grade Voice Activity Detector
Created 2020-11-23
458 commits to master branch, last one 7 days ago
Foundation Architecture for (M)LLMs
Created 2022-11-17
123 commits to main branch, last one 11 months ago
499
2.4k
other
95
WaveNet vocoder
Created 2017-12-27
261 commits to master branch, last one 4 years ago
PyTorch implementation of convolutional neural network-based text-to-speech synthesis models
Created 2017-10-31
221 commits to master branch, last one about a year ago
232
1.7k
apache-2.0
75
A curated list of awesome Speaker Diarization papers, libraries, datasets, and other resources.
Created 2019-01-19
123 commits to master branch, last one 5 months ago
AI-powered speech denoising and enhancement
Created 2023-11-15
13 commits to main branch, last one 3 months ago
179
1.6k
apache-2.0
22
Controllable and fast Text-to-Speech for over 7000 languages!
Created 2021-08-05
3,161 commits to MassiveScaleToucan branch, last one 4 months ago
💎 A list of accessible speech corpora for ASR, TTS, and other Speech Technologies
Created 2019-01-31
139 commits to master branch, last one 2 years ago
134
1.1k
mit
17
General Speech Restoration
Created 2021-09-06
99 commits to main branch, last one about a month ago
Open source audio annotation tool for humans
Created 2019-10-03
259 commits to main branch, last one about a month ago
StreamSpeech is an “All in One” seamless model for offline and simultaneous speech recognition, speech translation and speech synthesis.
Created 2024-06-04
25 commits to main branch, last one 7 months ago
248
793
apache-2.0
22
You can find the speech algorithms you want here
Created 2020-05-11
139 commits to master branch, last one 2 months ago
76
766
mit
23
Speech, Language, Audio, Music Processing with Large Language Model
Created 2023-10-23
886 commits to main branch, last one 25 days ago
A neural network for end-to-end speech denoising
Created 2017-06-19
3 commits to master branch, last one 7 years ago
Verbatim Automatic Speech Recognition with improved word-level timestamps and filler detection
Created 2024-05-24
9 commits to main branch, last one 3 months ago
161
605
mit
9
TensorFlow 2.x implementation of the DTLN real-time speech denoising model, with TF-Lite, ONNX, and real-time audio processing support.
Created 2020-05-11
101 commits to master branch, last one 2 years ago
This is the main repository of open-sourced speech technology by Huawei Noah's Ark Lab.
Created 2021-07-19
22 commits to main branch, last one 2 years ago
PyTorch implementation of "FullSubNet: A Full-Band and Sub-Band Fusion Model for Real-Time Single-Channel Speech Enhancement."
Created 2020-12-18
104 commits to main branch, last one about a year ago
Speech-related labs, companies, resources, internships, and more; recommendations and self-nominations are welcome
Created 2021-11-04
109 commits to main branch, last one 4 months ago
[NeurIPS 2021] Multiscale Benchmarks for Multimodal Representation Learning
Created 2021-03-05
1,258 commits to main branch, last one about a year ago
102
474
mit
66
Speech recognition toolkit for the Arduino
This repository has been archived
Created 2012-08-12
133 commits to 4.x-workingBranch branch, last one 3 years ago
79
468
bsd-3-clause
11
🔉 spafe: Simplified Python Audio Features Extraction
Created 2019-09-16
377 commits to master branch, last one 11 days ago
This repo summarizes the tutorials, datasets, papers, code, and tools for the speech separation and speaker extraction tasks. Pull requests are welcome.
Created 2020-06-16
80 commits to master branch, last one 4 years ago
74
454
other
18
UniSpeech - Large Scale Self-Supervised Learning for Speech
Created 2021-07-14
73 commits to main branch, last one 12 months ago
78
441
other
22
A Python wrapper for the Speech Signal Processing Toolkit (SPTK).
Created 2015-08-30
359 commits to master branch, last one 8 months ago