Trending repositories for topic asr

Last 3 days (new repositories)

no newly created repositories trending in the last 3 days

Last 3 days (absolute gain)

TEN-framework/TEN-Agent

TEN Agent is a conversational AI powered by the TEN, integrating Gemini 2.0 Live, OpenAI Realtime, RTC, and more. It delivers real-time capabilities to see, hear, and speak, while being fully compatib...

3,587 (+271)

apache-2.0

m-bain/whisperX

WhisperX: Automatic Speech Recognition with Word-level Timestamps (& Diarization)

12,958 (+54)

bsd-2-clause

Evil0ctal/Fast-Powerful-Whisper-AI-Services-API

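⚡ A high-performance asynchronous API for automatic speech recognition (ASR) and translation. There is no need to pay for the Whisper API: inference runs on a locally hosted Whisper model, with multi-GPU concurrency and a design aimed at distributed deployment. It also includes built-in crawlers for social media platforms such as TikTok and Douyin, enabling seamless media processing across multiple platforms and providing a powerful, scalable solution for automated processing of media content.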

257 (+52)

apache-2.0

FunAudioLLM/SenseVoice

Multilingual Voice Understanding Model

3,769 (+44)

NexaAI/nexa-sdk

Nexa SDK is a comprehensive toolkit for supporting GGML and ONNX models. It supports text generation, image generation, vision-language models (VLM), Audio Language Model, auto-speech-recognition (ASR...

5,096 (+35)

apache-2.0

alphacep/vosk-api

Offline speech recognition API for Android, iOS, Raspberry Pi and servers with Python, Java, C# and Node

8,337 (+34)

apache-2.0

MahmoudAshraf97/whisper-diarization

Automatic Speech Recognition with Speaker Diarization based on OpenAI Whisper

3,901 (+25)

bsd-2-clause

PaddlePaddle/PaddleSpeech

Easy-to-use Speech Toolkit including Self-Supervised Learning model, SOTA/Streaming ASR with punctuation, Streaming TTS with text frontend, Speaker Verification System, End-to-End Speech Translation a...

11,291 (+24)

apache-2.0

NVIDIA/NeMo

A scalable generative AI framework built for researchers and developers working on Large Language Models, Multimodal, and Speech AI (Automatic Speech Recognition and Text-to-Speech)

12,487 (+20)

apache-2.0

speechbrain/speechbrain

A PyTorch-based Speech Toolkit

9,088 (+17)

apache-2.0

Purfview/whisper-standalone-win

Whisper & Faster-Whisper standalone executables for those who don't want to bother with Python.

1,418 (+13)

jdepoix/youtube-transcript-api

This is a python API which allows you to get the transcript/subtitles for a given YouTube video. It also works for automatically generated subtitles and it does not require an API key nor a headless b...

3,168 (+13)

mit

lenML/Speech-AI-Forge

🍦 Speech-AI-Forge is a project developed around TTS generation model, implementing an API Server and a Gradio-based WebUI.

913 (+11)

agpl-3.0

k2-fsa/sherpa-onnx

Speech-to-text, text-to-speech, speaker diarization, and VAD using next-gen Kaldi with onnxruntime without Internet connection. Support embedded systems, Android, iOS, HarmonyOS, Raspberry Pi, RISC-V,...

3,877 (+11)

apache-2.0

CheshireCC/faster-whisper-GUI

faster_whisper GUI with PySide6

1,819 (+10)

agpl-3.0

Henry-23/VideoChat

Real-time voice-interactive digital human, supporting an end-to-end voice pipeline (GLM-4-Voice - THG) and a cascaded pipeline (ASR-LLM-TTS-THG). Appearance and voice are customizable without training, voice cloning is supported, and first-packet latency is as low as 3 s. ...

551 (+7)

mit

xiangyuecn/Recorder

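HTML5 JS audio recording in mp3, wav, ogg, webm, amr, g711a and g711u formats. Supports PC and some Android and iOS browsers, Hybrid Apps (Android and iOS app source code provided) and WeChat; provides ASR speech-to-text, an H5 voice call/chat example, and DTMF encoding/decoding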

4,955 (+7)

mit

wzpan/wukong-robot

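🤖 wukong-robot is a simple, flexible and elegant Chinese voice dialogue robot / smart speaker project. It supports ChatGPT multi-turn conversation and may be the first open-source smart speaker project to support brain-computer interaction.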

6,451 (+6)

mit

linto-ai/whisper-timestamped

Multilingual Automatic Speech Recognition with word-level timestamps and confidence

2,130 (+6)

agpl-3.0

PeterH0323/Streamer-Sales

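Streamer-Sales (销冠, "sales champion"): a sales-streamer LLM 🛒🎁 that presents products based on their given features, from the angle of stimulating viewers' desire to buy. 🚀⭐ Includes a detailed data generation pipeline❗ 📦 Also integrates LMDeploy accelerated inference 🚀, RAG (retrieval-augmented generation) 📚, TTS (text-to-speech) 🔊, digital human generation 🦸, an Agent that queries the web for real-time information 🌐, ASR (speech-to-text) 🎙️, a front end built on the Vue ecosystem 🍍, F...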

2,697 (+4)

agpl-3.0

Last 3 days (relative gain)

Evil0ctal/Fast-Powerful-Whisper-AI-Services-API

257 (+25%)

apache-2.0

TEN-framework/TEN-Agent

3,587 (+8%)

apache-2.0

tsmdt/whisply

💬 Transcribe, translate, diarize, annotate and subtitle video (and audio) with Whisper on Win, Linux and Mac ... Fast!!

25 (+4%)

apache-2.0

wwbin2017/bailing

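Bailing (百聆) is a GPT-4o-style voice dialogue robot built from ASR+LLM+TTS, with latency as low as 800 ms; it runs on low-end hardware and supports interruption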

54 (+2%)

mit

Henry-23/VideoChat

551 (+1%)

mit

nyrahealth/CrisperWhisper

Verbatim Automatic Speech Recognition with improved word-level timestamps and filler detection

478 (+1%)

lenML/Speech-AI-Forge

🍦 Speech-AI-Forge is a project developed around TTS generation model, implementing an API Server and a Gradio-based WebUI.

913 (+1%)

agpl-3.0

FunAudioLLM/SenseVoice

Multilingual Voice Understanding Model

3,769 (+1%)

Purfview/whisper-standalone-win

Whisper & Faster-Whisper standalone executables for those who don't want to bother with Python.

1,418 (+0.9%)

vilassn/whisper_android

Offline Speech Recognition with OpenAI Whisper and TensorFlow Lite for Android

272 (+0.7%)

mit

NexaAI/nexa-sdk

5,096 (+0.7%)

apache-2.0

MahmoudAshraf97/whisper-diarization

Automatic Speech Recognition with Speaker Diarization based on OpenAI Whisper

3,901 (+0.6%)

bsd-2-clause

CheshireCC/faster-whisper-GUI

faster_whisper GUI with PySide6

1,819 (+0.6%)

agpl-3.0

DmitryRyumin/ICASSP-2023-24-Papers

ICASSP 2023-2024 Papers: A complete collection of influential and exciting research papers from the ICASSP 2023-24 conferences. Explore the latest advancements in acoustics, speech and signal processi...

418 (+0.5%)

mit

m-bain/whisperX

WhisperX: Automatic Speech Recognition with Word-level Timestamps (& Diarization)

12,958 (+0.4%)

bsd-2-clause

ycyy/faster-whisper-webui

a gradio webui for faster whisper

240 (+0.4%)

apache-2.0

alphacep/vosk-api

Offline speech recognition API for Android, iOS, Raspberry Pi and servers with Python, Java, C# and Node

8,337 (+0.4%)

apache-2.0

R3gm/SoniTranslate

Synchronized Translation for Videos. Video dubbing

927 (+0.3%)

apache-2.0

k2-fsa/sherpa-onnx

3,877 (+0.3%)

apache-2.0

linto-ai/whisper-timestamped

Multilingual Automatic Speech Recognition with word-level timestamps and confidence

2,130 (+0.3%)

agpl-3.0

Last week (new repositories)

no newly created repositories trending in the last week

Last week (absolute gain)

TEN-framework/TEN-Agent

3,587 (+1,237)

apache-2.0

NexaAI/nexa-sdk

5,096 (+173)

apache-2.0

Evil0ctal/Fast-Powerful-Whisper-AI-Services-API

257 (+106)

apache-2.0

m-bain/whisperX

WhisperX: Automatic Speech Recognition with Word-level Timestamps (& Diarization)

12,958 (+98)

bsd-2-clause

FunAudioLLM/SenseVoice

Multilingual Voice Understanding Model

3,769 (+83)

alphacep/vosk-api

Offline speech recognition API for Android, iOS, Raspberry Pi and servers with Python, Java, C# and Node

8,337 (+57)

apache-2.0

k2-fsa/sherpa-onnx

3,877 (+48)

apache-2.0

NVIDIA/NeMo

A scalable generative AI framework built for researchers and developers working on Large Language Models, Multimodal, and Speech AI (Automatic Speech Recognition and Text-to-Speech)

12,487 (+46)

apache-2.0

MahmoudAshraf97/whisper-diarization

Automatic Speech Recognition with Speaker Diarization based on OpenAI Whisper

3,901 (+39)

bsd-2-clause

PaddlePaddle/PaddleSpeech

11,291 (+36)

apache-2.0

speechbrain/speechbrain

A PyTorch-based Speech Toolkit

9,088 (+31)

apache-2.0

jdepoix/youtube-transcript-api

3,168 (+30)

mit

CheshireCC/faster-whisper-GUI

faster_whisper GUI with PySide6

1,819 (+24)

agpl-3.0

Purfview/whisper-standalone-win

Whisper & Faster-Whisper standalone executables for those who don't want to bother with Python.

1,418 (+23)

Henry-23/VideoChat

551 (+19)

mit

lenML/Speech-AI-Forge

🍦 Speech-AI-Forge is a project developed around TTS generation model, implementing an API Server and a Gradio-based WebUI.

913 (+16)

agpl-3.0

nyrahealth/CrisperWhisper

Verbatim Automatic Speech Recognition with improved word-level timestamps and filler detection

478 (+15)

R3gm/SoniTranslate

Synchronized Translation for Videos. Video dubbing

927 (+13)

apache-2.0

wzpan/wukong-robot

6,451 (+13)

mit

linto-ai/whisper-timestamped

Multilingual Automatic Speech Recognition with word-level timestamps and confidence

2,130 (+13)

agpl-3.0

Last week (relative gain)

Evil0ctal/Fast-Powerful-Whisper-AI-Services-API

257 (+70%)

apache-2.0

TEN-framework/TEN-Agent

3,587 (+53%)

apache-2.0

wwbin2017/bailing

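Bailing (百聆) is a GPT-4o-style voice dialogue robot built from ASR+LLM+TTS, with latency as low as 800 ms; it runs on low-end hardware and supports interruption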

54 (+10%)

mit

Alannikos/FunGPT

In this fast-paced world, we all need a little something to spice up life. Whether you need a glass of sweet talk to lift your spirits or a dose of sharp retorts to let off steam, FunGPT has got you c...

33 (+10%)

mit

tsmdt/whisply

💬 Transcribe, translate, diarize, annotate and subtitle video (and audio) with Whisper on Win, Linux and Mac ... Fast!!

25 (+9%)

apache-2.0

pavelzbornik/whisperX-FastAPI

FastAPI service on top of WhisperX

56 (+8%)

Henry-23/VideoChat

551 (+4%)

mit

NexaAI/nexa-sdk

5,096 (+4%)

apache-2.0

nyrahealth/CrisperWhisper

Verbatim Automatic Speech Recognition with improved word-level timestamps and filler detection

478 (+3%)

JosefAlbers/whisper-turbo-mlx

Blazing fast whisper turbo for ASR (speech-to-text) tasks

180 (+3%)

mit

FunAudioLLM/SenseVoice

Multilingual Voice Understanding Model

3,769 (+2%)

litongjava/whisper-cpp-server

whisper-cpp-serve Real-time speech recognition and c+ of OpenAI's Whisper model in C/C++

54 (+2%)

mit

lenML/Speech-AI-Forge

🍦 Speech-AI-Forge is a project developed around TTS generation model, implementing an API Server and a Gradio-based WebUI.

913 (+2%)

agpl-3.0

Purfview/whisper-standalone-win

Whisper & Faster-Whisper standalone executables for those who don't want to bother with Python.

1,418 (+2%)

revdotcom/reverb

Open source inference code for Rev's model

347 (+1%)

apache-2.0

R3gm/SoniTranslate

Synchronized Translation for Videos. Video dubbing

927 (+1%)

apache-2.0

CheshireCC/faster-whisper-GUI

faster_whisper GUI with PySide6

1,819 (+1%)

agpl-3.0

k2-fsa/sherpa-onnx

3,877 (+1%)

apache-2.0

fgnt/meeteval

MeetEval - A meeting transcription evaluation toolkit

82 (+1%)

mit

vilassn/whisper_android

Offline Speech Recognition with OpenAI Whisper and TensorFlow Lite for Android

272 (+1%)

mit

Last month (new repositories)

no newly created repositories trending in the last month

Last month (absolute gain)

TEN-framework/TEN-Agent

3,587 (+2,021)

apache-2.0

NexaAI/nexa-sdk

5,096 (+1,166)

apache-2.0

m-bain/whisperX

WhisperX: Automatic Speech Recognition with Word-level Timestamps (& Diarization)

12,958 (+422)

bsd-2-clause

NVIDIA/NeMo

A scalable generative AI framework built for researchers and developers working on Large Language Models, Multimodal, and Speech AI (Automatic Speech Recognition and Text-to-Speech)

12,487 (+321)

apache-2.0

FunAudioLLM/SenseVoice

Multilingual Voice Understanding Model

3,769 (+296)

k2-fsa/sherpa-onnx

3,877 (+237)

apache-2.0

nyrahealth/CrisperWhisper

Verbatim Automatic Speech Recognition with improved word-level timestamps and filler detection

478 (+199)

alphacep/vosk-api

Offline speech recognition API for Android, iOS, Raspberry Pi and servers with Python, Java, C# and Node

8,337 (+196)

apache-2.0

MahmoudAshraf97/whisper-diarization

Automatic Speech Recognition with Speaker Diarization based on OpenAI Whisper

3,901 (+167)

bsd-2-clause

CheshireCC/faster-whisper-GUI

faster_whisper GUI with PySide6

1,819 (+164)

agpl-3.0

Evil0ctal/Fast-Powerful-Whisper-AI-Services-API

257 (+150)

apache-2.0

speechbrain/speechbrain

A PyTorch-based Speech Toolkit

9,088 (+140)

apache-2.0

Henry-23/VideoChat

551 (+124)

mit

jdepoix/youtube-transcript-api

3,168 (+124)

mit

PaddlePaddle/PaddleSpeech

11,291 (+119)

apache-2.0

PeterH0323/Streamer-Sales

2,697 (+101)

agpl-3.0

Purfview/whisper-standalone-win

Whisper & Faster-Whisper standalone executables for those who don't want to bother with Python.

1,418 (+86)

wzpan/wukong-robot

6,451 (+76)

mit

linto-ai/whisper-timestamped

Multilingual Automatic Speech Recognition with word-level timestamps and confidence

2,130 (+74)

agpl-3.0

ahmetoner/whisper-asr-webservice

OpenAI Whisper ASR Webservice API

2,188 (+72)

mit

Last month (relative gain)

Alannikos/FunGPT

33 (+267%)

mit

Evil0ctal/Fast-Powerful-Whisper-AI-Services-API

257 (+140%)

apache-2.0

TEN-framework/TEN-Agent

3,587 (+129%)

apache-2.0

nyrahealth/CrisperWhisper

Verbatim Automatic Speech Recognition with improved word-level timestamps and filler detection

478 (+71%)

tsmdt/whisply

💬 Transcribe, translate, diarize, annotate and subtitle video (and audio) with Whisper on Win, Linux and Mac ... Fast!!

25 (+47%)

apache-2.0

pavelzbornik/whisperX-FastAPI

FastAPI service on top of WhisperX

56 (+33%)

NexaAI/nexa-sdk

5,096 (+30%)

apache-2.0

Henry-23/VideoChat

551 (+29%)

mit

wwbin2017/bailing

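Bailing (百聆) is a GPT-4o-style voice dialogue robot built from ASR+LLM+TTS, with latency as low as 800 ms; it runs on low-end hardware and supports interruption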

54 (+29%)

mit

thewh1teagle/pyannote-rs

pyannote audio diarization in rust

38 (+27%)

mit

abus-aikorea/kara-audio

Gradio WebUI for whisper, faster-whisper, whisper-timestamped. Supports YouTube Downloader, Vocal Remover and Transcription.

31 (+15%)

gpl-3.0

JosefAlbers/whisper-turbo-mlx

Blazing fast whisper turbo for ASR (speech-to-text) tasks

180 (+13%)

mit

vilassn/whisper_android

Offline Speech Recognition with OpenAI Whisper and TensorFlow Lite for Android

272 (+12%)

mit

litongjava/whisper-cpp-server

whisper-cpp-serve Real-time speech recognition and c+ of OpenAI's Whisper model in C/C++

54 (+10%)

mit

CheshireCC/faster-whisper-GUI

faster_whisper GUI with PySide6

1,819 (+10%)

agpl-3.0

FunAudioLLM/SenseVoice

Multilingual Voice Understanding Model

3,769 (+9%)

gongouveia/Whisper-Synthetic-ASR-Dataset-Generator

This UI serves as a Synthetic ASR Dataset Generator powered by/for OpenAI Whisper, enabling users to capture audio, transcribing it, on the fly and manage the generated dataset 🤗. Fine tune Whisper ...

26 (+8%)

linto-ai/linto-stt

An automatic speech recognition API

48 (+7%)

agpl-3.0

lenML/Speech-AI-Forge

🍦 Speech-AI-Forge is a project developed around TTS generation model, implementing an API Server and a Gradio-based WebUI.

913 (+7%)

agpl-3.0

k2-fsa/sherpa-onnx

3,877 (+7%)

apache-2.0

Last 12 months (new repositories)

NexaAI/nexa-sdk

5,096

apache-2.0

FunAudioLLM/SenseVoice

Multilingual Voice Understanding Model

3,769

TEN-framework/TEN-Agent

3,587

apache-2.0

PeterH0323/Streamer-Sales

2,697

agpl-3.0

harry0703/AudioNotes

Quickly extracts content from audio and video and organizes it into structured markdown notes

1,135

mit

ictnlp/StreamSpeech

StreamSpeech is an “All in One” seamless model for offline and simultaneous speech recognition, speech translation and speech synthesis.

978

mit

lenML/Speech-AI-Forge

🍦 Speech-AI-Forge is a project developed around TTS generation model, implementing an API Server and a Gradio-based WebUI.

913

agpl-3.0

Henry-23/VideoChat

551

mit

nyrahealth/CrisperWhisper

Verbatim Automatic Speech Recognition with improved word-level timestamps and filler detection

478

metame-ai/awesome-audio-plaza

Daily tracking of awesome audio papers, including music generation, zero-shot tts, asr, audio generation

362

mit

revdotcom/reverb

Open source inference code for Rev's model

347

apache-2.0

Evil0ctal/Fast-Powerful-Whisper-AI-Services-API

257

apache-2.0

ElmTran/praises

Praises is a text-to-speech tool that can help you read text easily.

234

mit

JosefAlbers/whisper-turbo-mlx

Blazing fast whisper turbo for ASR (speech-to-text) tasks

180

mit

kurianbenoy/Indic-Subtitler

Open source subtitling platform 💻 for transcribing and translating videos/audios in Indic languages.

gpl-2.0

quocanh34/Bud500

Bud500: A Comprehensive Vietnamese ASR Dataset

apache-2.0

QuantiusBenignus/blurt

Gnome shell extension for accurate speech to text input in Linux using whisper.cpp. Input text from speech anywhere.

gpl-3.0

wwbin2017/bailing

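Bailing (百聆) is a GPT-4o-style voice dialogue robot built from ASR+LLM+TTS, with latency as low as 800 ms; it runs on low-end hardware and supports interruption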

mit

cmeraki/audiotoken

Audio tokenization, in the fastest way possible!

apache-2.0

thewh1teagle/pyannote-rs

pyannote audio diarization in rust

mit

Last 12 months (absolute gain)

m-bain/whisperX

WhisperX: Automatic Speech Recognition with Word-level Timestamps (& Diarization)

12,958 (+5,864)

bsd-2-clause

NexaAI/nexa-sdk

5,096 (+5,095)

apache-2.0

FunAudioLLM/SenseVoice

Multilingual Voice Understanding Model

3,769 (+3,767)

NVIDIA/NeMo

A scalable generative AI framework built for researchers and developers working on Large Language Models, Multimodal, and Speech AI (Automatic Speech Recognition and Text-to-Speech)

12,487 (+3,736)

apache-2.0

TEN-framework/TEN-Agent

3,587 (+3,576)

apache-2.0

k2-fsa/sherpa-onnx

3,877 (+3,485)

apache-2.0

PeterH0323/Streamer-Sales

2,697 (+2,696)

agpl-3.0

MahmoudAshraf97/whisper-diarization

Automatic Speech Recognition with Speaker Diarization based on OpenAI Whisper

3,901 (+2,496)

bsd-2-clause

speechbrain/speechbrain

A PyTorch-based Speech Toolkit

9,088 (+2,102)

apache-2.0

PaddlePaddle/PaddleSpeech

11,291 (+1,903)

apache-2.0

alphacep/vosk-api

Offline speech recognition API for Android, iOS, Raspberry Pi and servers with Python, Java, C# and Node

8,337 (+1,833)

apache-2.0

CheshireCC/faster-whisper-GUI

faster_whisper GUI with PySide6

1,819 (+1,608)

agpl-3.0

jdepoix/youtube-transcript-api

3,168 (+1,237)

mit

harry0703/AudioNotes

Quickly extracts content from audio and video and organizes it into structured markdown notes

1,135 (+1,134)

mit

wzpan/wukong-robot

6,451 (+1,128)

mit

ahmetoner/whisper-asr-webservice

OpenAI Whisper ASR Webservice API

2,188 (+994)

mit

linto-ai/whisper-timestamped

Multilingual Automatic Speech Recognition with word-level timestamps and confidence

2,130 (+989)

agpl-3.0

ictnlp/StreamSpeech

StreamSpeech is an “All in One” seamless model for offline and simultaneous speech recognition, speech translation and speech synthesis.

978 (+972)

mit

Purfview/whisper-standalone-win

Whisper & Faster-Whisper standalone executables for those who don't want to bother with Python.

1,418 (+971)

lenML/Speech-AI-Forge

🍦 Speech-AI-Forge is a project developed around TTS generation model, implementing an API Server and a Gradio-based WebUI.

913 (+906)

agpl-3.0

Last 12 months (relative gain)

TEN-framework/TEN-Agent

3,587 (+32,509%)

apache-2.0

ictnlp/StreamSpeech

StreamSpeech is an “All in One” seamless model for offline and simultaneous speech recognition, speech translation and speech synthesis.

978 (+16,200%)

mit

lenML/Speech-AI-Forge

🍦 Speech-AI-Forge is a project developed around TTS generation model, implementing an API Server and a Gradio-based WebUI.

913 (+12,943%)

agpl-3.0

R3gm/SoniTranslate

Synchronized Translation for Videos. Video dubbing

927 (+1,915%)

apache-2.0

ElmTran/praises

Praises is a text-to-speech tool that can help you read text easily.

234 (+1,700%)

mit

k2-fsa/sherpa-onnx

3,877 (+889%)

apache-2.0

vilassn/whisper_android

Offline Speech Recognition with OpenAI Whisper and TensorFlow Lite for Android

272 (+871%)

mit

CheshireCC/faster-whisper-GUI

faster_whisper GUI with PySide6

1,819 (+762%)

agpl-3.0

shashikg/WhisperS2T

An Optimized Speech-to-Text Pipeline for the Whisper Model Supporting Multiple Inference Engine

332 (+710%)

mit

litongjava/whisper-cpp-server

whisper-cpp-serve Real-time speech recognition and c+ of OpenAI's Whisper model in C/C++

54 (+671%)

mit

avsrma/LLM-based-AI-Assistant

A general purpose AI voice assistant built using GPT-4.

31 (+520%)

abus-aikorea/kara-audio

Gradio WebUI for whisper, faster-whisper, whisper-timestamped. Supports YouTube Downloader, Vocal Remover and Transcription.

31 (+520%)

gpl-3.0

AI4Bharat/Chitralekha

Chitralekha - A video transcreation platform for Indic languages, supporting transcription, translation and voice-over

93 (+365%)

mit

jim60105/docker-whisperX

Dockerfile for WhisperX: Automatic Speech Recognition with Word-Level Timestamps and Speaker Diarization (Dockerfile, CI image build and test)

190 (+332%)

mit

cmeraki/audiotoken

Audio tokenization, in the fastest way possible!

46 (+318%)

apache-2.0

fengredrum/finetune-whisper-lora

Fine-Tune Whisper with Transformers and PEFT

41 (+310%)

mit

JosefAlbers/whisper-turbo-mlx

Blazing fast whisper turbo for ASR (speech-to-text) tasks

180 (+283%)

mit

mkiol/dsnote

Speech Note Linux app. Note taking, reading and translating with offline Speech to Text, Text to Speech and Machine translation.

612 (+229%)

mpl-2.0

weimeng23/speech-recognition-learning-resources

✅ A list of speech recognition learning resources including courses, books, tutorials, papers and toolkits.

49 (+227%)

cc-by-sa-4.0

DmitryRyumin/ICASSP-2023-24-Papers

418 (+224%)

mit