Search Results - RepositoryStats

wenet wenet-e2e

1.1k

4.4k

apache-2.0

92

Production First and Production Ready End-to-End Speech Recognition Toolkit

asr pytorch whisper conformer e2e-models transformer production-ready speech-recognition automatic-speech-recognition

Created 2020-11-17

1,593 commits to main branch, last one 8 days ago

awesome-speech-recognition-speech-synthesis-papers zzw922cn

513

3.0k

mit

186

Automatic Speech Recognition (ASR), Speaker Verification, Speech Synthesis, Text-to-Speech (TTS), Language Modelling, Singing Voice Synthesis (SVS), Voice Conversion (VC)

Created 2017-04-28

181 commits to master branch, last one about a year ago

Automatic_Speech_Recognition zzw922cn

534

2.8k

mit

145

End-to-end Automatic Speech Recognition for Mandarin and English in TensorFlow

cnn rnn lstm audio paper phonemes end-to-end evaluation tensorflow deep-learning timit-dataset feature-vector data-preprocessing speech-recognition layer-normalization rnn-encoder-decoder chinese-speech-recognition automatic-speech-recognition

Created 2016-11-13

266 commits to master branch, last one 3 years ago

whisper-asr-webservice ahmetoner

448

2.5k

mit

30

OpenAI Whisper ASR Webservice API

asr docker speech openai-whisper speech-to-text speech-recognition automatic-speech-recognition

Created 2022-09-22

301 commits to main branch, last one about a month ago

STT coqui-ai

283

2.4k

mpl-2.0

61

🐸STT - The deep learning toolkit for Speech-to-Text. Training and deploying STT models has never been so easy.

asr stt tensorflow deep-learning speech-to-text speech-recognizer voice-recognition speech-recognition speech-recognition-api automatic-speech-recognition

Created 2021-03-04

4,125 commits to main branch, last one 2 years ago

pororo kakaobrain

222

1.3k

apache-2.0

38

PORORO: Platform Of neuRal mOdels for natuRal language prOcessing

deep-learning neural-models speech-synthesis natural-language-processing automatic-speech-recognition

This repository has been archived

Created 2021-01-28

139 commits to master branch, last one 4 years ago

TensorFlowASR TensorSpeech

245

965

apache-2.0

29

⚡ TensorFlowASR: Almost State-of-the-art Automatic Speech Recognition in TensorFlow 2. Supports languages that can use characters or subwords

ctc jasper tflite end2end conformer contextnet tensorflow deepspeech2 tensorflow2 tflite-model rnn-transducer speech-to-text tflite-convertion speech-recognition streaming-transducer subword-speech-recognition automatic-speech-recognition

Created 2020-02-13

1,124 commits to main branch, last one about a month ago

FireRedASR FireRedTeam

62

849

apache-2.0

16

Open-source industrial-grade ASR models supporting Mandarin, Chinese dialects and English, achieving a new SOTA on public Mandarin ASR benchmarks, while also offering outstanding singing lyrics recogn...

asr llm conformer speechllm open-source transformer multimodal-llm industrial-grade speech-recognition automatic-speech-recognition

Created 2025-01-24

2 commits to main branch, last one 11 days ago

jiwer jitsi

101

708

apache-2.0

15

Evaluate your speech-to-text system with similarity measures such as word error rate (WER)

wer python3 speech-to-text word-error-rate evaluation-metrics automatic-speech-recognition

Created 2018-06-19

107 commits to master branch, last one about a month ago

whispering shirayu

52

687

mit

20

Streaming transcriber with whisper

whisper automatic-speech-recognition

This repository has been archived

Created 2022-09-23

259 commits to master branch, last one about a year ago

awesome-large-audio-models EmulationAI

39

665

unknown

26

Collection of resources on the applications of Large Language Models (LLMs) in Audio AI.

audio-ai music-ai speech-ai speech-llms speech-to-text audio-processing music-processing large-audio-models foundational-models large-language-models large-language-model-speech music-information-retrieval automatic-speech-recognition

Created 2023-08-18

107 commits to main branch, last one 8 months ago

cheetah Picovoice

70

619

apache-2.0

32

On-device streaming speech-to-text engine powered by deep learning

asr stt transcription speech-to-text voice-recognition speech-recognition streaming-speech-to-text online-speech-recognition automatic-speech-recognition

Created 2018-10-28

339 commits to master branch, last one 2 days ago

neural_sp hirofumi0810

139

596

apache-2.0

33

End-to-end ASR/LM implementation with PyTorch

asr ctc speech pytorch seq2seq attention streaming transformer language-model rnn-transducer transformer-xl language-modeling speech-recognition attention-mechanism sequence-to-sequence automatic-speech-recognition

Created 2017-09-10

3,218 commits to master branch, last one 3 years ago

awesome-kaldi YoavRamon

84

535

mit

25

This is a list of features, scripts, blogs and resources for better using Kaldi (http://kaldi-asr.org/)

kaldi speech kaldi-asr awesome-list speech-to-text speech-recognition automatic-speech-recognition

Created 2019-01-14

15 commits to master branch, last one 3 years ago

TensorflowASR Z-yq

114

472

apache-2.0

22

A project dedicated to bringing CPU and on-device models close to GPU-model performance; the real-time factor (RTF) on CPU is below 0.1

cpp ctc bert tensorflow2 transducers transformer tensorflow-cpp state-of-the-art listen-attend-and-spell automatic-speech-recognition

Created 2019-10-29

203 commits to v2 branch, last one 24 days ago

leopard Picovoice

27

452

apache-2.0

17

On-device speech-to-text engine powered by deep learning

asr stt on-device transcription voice-to-text speech-to-text voice-recognition speech-recognition automatic-speech-recognition

Created 2020-01-14

314 commits to master branch, last one 2 days ago

huggingsound jonatasgrosman

45

446

mit

16

HuggingSound: A toolkit for speech-related tasks based on Hugging Face's tools

asr audio speech transformers speech-to-text speech-recognition automatic-speech-recognition

Created 2022-02-18

44 commits to main branch, last one about a year ago

speech_dataset double22a

77

410

apache-2.0

9

The dataset of Speech Recognition

asr tts wav audio speech dataset deep-learning speech-to-text text-to-speech speech-synthesis voice-conversion speech-separation speech-diarization speech-enhancement speech-recognition speech-translation speech-segmentation deep-neural-networks automatic-speech-recognition

Created 2021-04-07

72 commits to main branch, last one 3 months ago

whisper_android vilassn

63

402

mit

5

Offline Speech Recognition with OpenAI Whisper and TensorFlow Lite for Android

asr tts mobile openai tflite android offline whisper embedded tensorflow transcribe translation texttospeech transcription tensorflowlite text-to-speech speech-recognition automatic-speech-recognition

Created 2023-08-26

56 commits to master branch, last one about a month ago

whisper-youtube ArthurFDLR

114

393

mit

6

🔉 YouTube Videos Transcription with OpenAI's Whisper

whisper youtube transformer colab-notebook speech-to-text speech-recognition automatic-speech-recognition

Created 2022-10-02

21 commits to main branch, last one about a year ago

soxan m3hrdadfi

38

261

apache-2.0

8

Wav2Vec for speech recognition, classification, and audio classification

speech-recognition emotion-recognition speech-classification speech-emotion-recognition automatic-speech-recognition

Created 2021-05-25

26 commits to main branch, last one 3 years ago

deep_avsr smeetrs

43

227

mit

7

A PyTorch implementation of the Deep Audio-Visual Speech Recognition paper.

lip-reading speech-to-text speech-recognition visual-speech-recognition automatic-speech-recognition audio-visual-speech-recognition

Created 2019-12-07

6 commits to master branch, last one about a year ago

speechlib NavodPeiris

18

202

mit

5

speechlib is a library that can do speaker diarization, transcription and speaker recognition on an audio file to create transcripts with actual speaker names

ai whisper-ai transcription faster-whisper speaker-diarization speaker-recognition speaker-verification automatic-speech-recognition

Created 2024-01-07

34 commits to main branch, last one about a month ago

Hey-Jetson bricewalker

40

197

gpl-3.0

12

Deep Learning based Automatic Speech Recognition with attention for the Nvidia Jetson.

Created 2018-03-16

438 commits to master branch, last one 10 months ago

sova-asr sovaai

22

171

apache-2.0

13

SOVA ASR (Automatic Speech Recognition)

asr stt speech asr-model wav2letter speech-to-text speech-recognition automatic-speech-recognition

Created 2020-08-18

27 commits to master branch, last one about a year ago

FAST-RIR anton-jeran

29

163

agpl-3.0

5

This is the official implementation of our neural-network-based fast diffuse room impulse response generator (FAST-RIR) for generating room impulse responses (RIRs) for a given acoustic environment.

rir speech acoustics augmentation deep-learning neural-network synthetic-data impulse-response diffuse-scattering room-impulse-response conditional-generation automatic-speech-recognition generative-adversarial-network implicit-neural-representation

Created 2021-09-30

47 commits to main branch, last one 8 months ago

elpis CoEDL

33

159

apache-2.0

15

🙊 software for creating speech recognition models.

kaldi docker python linguistics transcription computational-linguistics automatic-speech-recognition

Created 2018-10-25

1,519 commits to master branch, last one 10 months ago

spellbook-docker noco-ai

11

153

osl-3.0

4

AI stack for interacting with LLMs, Stable Diffusion, Whisper, xTTS and many other AI models

bark llama2 xttsv2 mixtral whisper llm-inference text-to-speech musicgeneration stable-diffusion automatic-speech-recognition

Created 2023-11-06

18 commits to master branch, last one 11 months ago

mongolian-speech-recognition tugstugi

52

134

unknown

31

Mongolian speech recognition with PyTorch

asr python pytorch mongolian deep-learning speech-to-text speech-recognition automatic-speech-recognition convolutional-neural-networks

Created 2018-09-11

132 commits to master branch, last one 4 years ago

at16k at16k

18

129

mit

11

Trained models for automatic speech recognition (ASR). A library to quickly build applications that require speech to text conversion.

asr asr-model speech-api speech-to-text voice-commands speech-analysis pretrained-models speech-recognizer voice-recognition speech-recognition automatic-speech-recognition

Created 2019-12-03

27 commits to master branch, last one 4 years ago