Search Results - RepositoryStats

790

7.4k

other

69

A Fundamental End-to-End Speech Recognition Toolkit and Open Source SOTA Pretrained Models, Supporting Speech Recognition, Voice Activity Detection, Text Post-processing etc.

vad rnnt dfsmn pytorch whisper conformer speechgpt speechllm paraformer punctuation pretrained-model speech-recognition speaker-diarization voice-activity-detection audio-visual-speech-recognition

Created 2022-11-24

4,732 commits to main branch, last one 17 hours ago

ffsubsync smacke

283

6.9k

mit

77

Automagically synchronize subtitles with video.

Created 2019-02-24

377 commits to master branch, last one 24 days ago

silero-vad snakers4

443

4.6k

mit

48

Silero VAD: pre-trained enterprise-grade Voice Activity Detector

vad onnx speech pytorch onnxruntime onnx-runtime voice-control voice-commands voice-detection speech-processing voice-recognition voice-activity-detection

Created 2020-11-23

447 commits to master branch, last one 26 days ago

faster-whisper-GUI CheshireCC

109

1.8k

agpl-3.0

15

faster_whisper GUI with PySide6

asr vad openai whisper whisperx transcribe faster-whisper voice-transcription

Created 2023-07-18

112 commits to main branch, last one 15 days ago

sherpa-ncnn k2-fsa

162

1.1k

apache-2.0

32

Real-time speech recognition and voice activity detection (VAD) using next-gen Kaldi with ncnn without Internet connection. Support iOS, Android, Linux, macOS, Windows, Raspberry Pi, VisionFive2, Lich...

c go asr cpp vad csharp kotlin python speech-recognition voice-activity-detection

Created 2022-09-04

197 commits to master branch, last one 3 months ago

VAD jtkim-kaist

235

845

unknown

44

Voice activity detection (VAD) toolkit including DNN, bDNN, LSTM and ACAM based VAD. We also provide our directly recorded dataset.

dnn vad acam bdnn data lstm speech attention voice-detection speech-recognition voice-activity-detection speech-activity-detection

Created 2017-04-18

115 commits to master branch, last one 3 years ago

auditok amsehili

96

751

mit

26

An audio/acoustic activity detection and audio segmentation tool

vad audio-data voice-detection audio-activities audio-segmentation voice-activity-detection

Created 2015-09-17

438 commits to main branch, last one 10 days ago

ICASSP-2023-24-Papers DmitryRyumin

17

418

mit

29

ICASSP 2023-2024 Papers: A complete collection of influential and exciting research papers from the ICASSP 2023-24 conferences. Explore the latest advancements in acoustics, speech and signal processi...

Created 2023-08-01

975 commits to main branch, last one 24 hours ago

voice_activity_detection filippogiruzzi

70

357

gpl-3.0

12

Voice Activity Detection based on Deep Learning & TensorFlow

vad python resnet speech tensorflow librispeech time-series deeplearning deep-learning mfcc-features machine-learning speech-detection speech-recognition librispeech-dataset deep-neural-networks artificial-intelligence voice-activity-detection time-series-classification

Created 2019-12-11

37 commits to master branch, last one 3 years ago

RuntimeAudioImporter gtreshchev

71

355

mit

8

Runtime Audio Importer plugin for Unreal Engine. Importing audio of various formats at runtime.

Created 2020-12-10

420 commits to main branch, last one 14 days ago

WhisperS2T shashikg

36

334

mit

14

An Optimized Speech-to-Text Pipeline for the Whisper Model Supporting Multiple Inference Engines

asr vad whisper tensorrt tensorrt-llm deep-learning speech-to-text speech-recognition voice-activity-detection

Created 2023-12-16

81 commits to main branch, last one 3 months ago

WhisperHallu EtienneAb3d

22

290

unknown

11

Experimental code: sound file preprocessing to optimize Whisper transcriptions without hallucinated texts

asr vad vocals whisper noise-removal text-to-speech audio-processing sound-processing

Created 2023-02-14

68 commits to main branch, last one about a month ago

android-vad gkonovalov

61

270

mit

9

Android Voice Activity Detection (VAD) library. Supports WebRTC VAD GMM, Silero VAD DNN, Yamnet VAD DNN models.

Created 2019-11-28

220 commits to main branch, last one 21 days ago

cobra Picovoice

11

183

apache-2.0

12

On-device voice activity detection (VAD) powered by deep learning

vad on-device voice-activity speech-recognition voice-activity-detector voice-activity-detection

Created 2021-09-14

280 commits to main branch, last one about a month ago

python-webrtc-audio-processing xiongyihui

52

178

unknown

8

Python bindings of WebRTC Audio Processing

ns agc vad python webrtc-audio-processing

Created 2017-02-24

24 commits to master branch, last one 3 months ago

voice-activity-detection voithru

27

150

mit

5

PyTorch implementation of Self-Attentive VAD, ICASSP 2021

vad voice-activity-detection

Created 2021-02-22

10 commits to main branch, last one 3 years ago

sic 0vercl0k

22

117

mit

12

Enumerate user mode shared memory mappings on Windows.

shm vad driver ntoskrnl windows-10 prototype-pte shared-memory windows-kernel

Created 2020-01-24

91 commits to master branch, last one 3 years ago

object_centric_VAD fjchange

30

98

mit

8

A TensorFlow re-implementation of CVPR 2019 "Object-centric Auto-Encoders and Dummy Anomalies for Abnormal Event Detection in Video"

vad anomaly cvpr2019

Created 2019-08-05

70 commits to master branch, last one 2 years ago

webrtc_apm xia-chu

45

97

unknown

4

Extraction of the APM-related code from WebRTC, including AEC/NS/AGC/VAD, plus MP3/AAC encoders and SoundTouch

ns aac aec agc jni mp3 vad webrtc soundtouch

Created 2018-11-26

25 commits to master branch, last one about a year ago

voxseg NickWilkinson37

12

83

mit

5

A Python library for voice activity detection (VAD) for speech/non-speech segmentation.

vad python speech python-library speech-processing speech-segmentation voice-activity-detection

Created 2021-01-12

68 commits to master branch, last one 3 years ago

karaok-AI EtienneAb3d

1

65

unknown

2

Karaoke Player / Editor with automatic clip creation from any song file using vocals and lyrics extraction (Speech-to-Text)

vad djing music lyrics karaoke whisper subtitles mp3-player party-apps karaoke-maker srt-subtitles speech-to-text sound-processing

Created 2023-03-05

29 commits to main branch, last one about a year ago

aria lef-fan

14

54

agpl-3.0

5

A local and uncensored AI entity.

ai bot llm vad python speech pytorch assistant deep-learning speech-to-text text-to-speech voice-assistant large-language-models

Created 2024-02-02

88 commits to main branch, last one 2 months ago

whisper_ros mgonzs13

14

54

mit

4

Speech-to-Text based on SileroVAD + whisper.cpp (GGML Whisper) for ROS 2

vad ggml ros2 whisper-cpp speech-to-text speech-recognition voice-activity-detection

Created 2023-05-01

185 commits to main branch, last one 5 days ago

voice-activity-detection-unity mochi-neko

6

53

mit

1

A voice activity detection (VAD) library for Unity.

vad unity

Created 2023-06-28

57 commits to main branch, last one about a year ago

HadreamAssistant HadreamOrg

5

45

unknown

0

HadreamAssistant, your smart-home/custom voice assistant, supporting Raspberry Pi/Linux

ai bot vad talk notion python3 snowboy

Created 2021-11-14

45 commits to main branch, last one 5 months ago

End-to-End-Speech-Recognition-Models sooftware

5

38

apache-2.0

3

PyTorch implementation of automatic speech recognition models.

asr e2e las vad pytorch end-to-end deepspeech2 transformer acoustic-model listen-attend-and-spell voice-activity-detection

Created 2020-11-28

18 commits to main branch, last one 3 years ago

mod_vadasr shanghaimoon888

27

37

unknown

1

A FreeSWITCH module that performs VAD and ASR using the iFLYTEK WebSocket API.

asr vad freeswitch freeswitch-esl freeswitch-plugin

Created 2022-07-01

7 commits to main branch, last one 2 years ago