Trending repositories for topic speech-to-text

Last 3 days (new repositories)

No newly created repositories trending in the last 3 days

Last 3 days (absolute gain)

ggerganov/whisper.cpp

Port of OpenAI's Whisper model in C/C++

36,447 (+92)

mit

SYSTRAN/faster-whisper

Faster Whisper transcription with CTranslate2

13,062 (+55)

mit

m-bain/whisperX

WhisperX: Automatic Speech Recognition with Word-level Timestamps (& Diarization)

12,958 (+54)

bsd-2-clause

Evil0ctal/Fast-Powerful-Whisper-AI-Services-API

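⚡ A high-performance asynchronous API for automatic speech recognition (ASR) and translation. No need to purchase the Whisper API: inference runs on locally hosted Whisper models, with multi-GPU concurrency and a design aimed at distributed deployment. It also includes built-in crawlers for social media platforms such as TikTok and Douyin, enabling seamless media processing across multiple social platforms and providing a powerful, scalable solution for automated processing of media content data.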

257 (+52)

apache-2.0

FunAudioLLM/SenseVoice

Multilingual Voice Understanding Model

3,769 (+44)

jianchang512/pyvideotrans

Translate a video from one language to another and add dubbing. Also supports speech recognition transcription, speech synthesis, and subtitle translation.

11,111 (+39)

gpl-3.0

alphacep/vosk-api

Offline speech recognition API for Android, iOS, Raspberry Pi and servers with Python, Java, C# and Node

8,337 (+34)

apache-2.0

MahmoudAshraf97/whisper-diarization

Automatic Speech Recognition with Speaker Diarization based on OpenAI Whisper

3,901 (+25)

bsd-2-clause

abus-aikorea/voice-pro

Comprehensive Gradio WebUI for audio processing, powered by Whisper engines (Whisper, Faster-Whisper, Whisper-Timestamped). Features Voice Changer, zero-shot Voice Cloning (E2, F5-TTS), YouTube downlo...

2,368 (+24)

mit

jianchang512/stt

Voice Recognition to Text Tool / An offline, locally run tool that converts audio and video to subtitles, with output in JSON, SRT subtitle, and plain-text formats

2,696 (+20)

gpl-3.0

speechbrain/speechbrain

A PyTorch-based Speech Toolkit

9,088 (+17)

apache-2.0

mozilla/DeepSpeech

DeepSpeech is an open source embedded (offline, on-device) speech-to-text engine which can run in real time on devices ranging from a Raspberry Pi 4 to high power GPU servers.

25,498 (+15)

mpl-2.0

pluja/whishper

Transcribe any audio to text, translate and edit subtitles 100% locally with a web UI. Powered by whisper models!

1,726 (+14)

agpl-3.0

tmoroney/auto-subs

Generate Subtitles & Diarize Speakers in DaVinci Resolve using AI.

698 (+13)

mit

Purfview/whisper-standalone-win

Whisper & Faster-Whisper standalone executables for those who don't want to bother with Python.

1,418 (+13)

kaldi-asr/kaldi

kaldi-asr/kaldi is the official location of the Kaldi project.

14,379 (+13)

modelscope/FunClip

Open-source, accurate and easy-to-use video speech recognition & clipping tool, with LLM-based AI clipping integrated.

3,897 (+13)

mit

k2-fsa/sherpa-onnx

Speech-to-text, text-to-speech, speaker diarization, and VAD using next-gen Kaldi with onnxruntime without Internet connection. Support embedded systems, Android, iOS, HarmonyOS, Raspberry Pi, RISC-V,...

3,877 (+11)

apache-2.0

KoljaB/RealtimeSTT

A robust, efficient, low-latency speech-to-text library with advanced voice activity detection, wake word activation and instant transcription.

2,225 (+11)

mit

Uberi/speech_recognition

Speech recognition module for Python, supporting several engines and APIs, online and offline.

8,496 (+10)

bsd-3-clause

Last 3 days (relative gain)

Evil0ctal/Fast-Powerful-Whisper-AI-Services-API

257 (+25%)

apache-2.0

tsmdt/whisply

💬 Transcribe, translate, diarize, annotate and subtitle video (and audio) with Whisper on Win, Linux and Mac ... Fast!!

25 (+4%)

apache-2.0

tmoroney/auto-subs

Generate Subtitles & Diarize Speakers in DaVinci Resolve using AI.

698 (+2%)

mit

echogarden-project/echogarden

Cross-platform speech toolset, used from the command-line or as a Node.js library. Includes a variety of engines for speech synthesis, speech recognition, forced alignment, speech translation, voice i...

253 (+1%)

gpl-3.0

FunAudioLLM/SenseVoice

Multilingual Voice Understanding Model

3,769 (+1%)

abus-aikorea/voice-pro

2,368 (+1%)

mit

albirrkarim/react-speech-highlight-demo

React / Vanilla JS text-to-speech that highlights the words and sentences being spoken, using audio files, a text-to-speech API, and the Web Speech Synthesis API

99 (+1%)

Purfview/whisper-standalone-win

Whisper & Faster-Whisper standalone executables for those who don't want to bother with Python.

1,418 (+0.9%)

misyaguziya/VRCT

VRCT(VRChat Chatbox Translator & Transcription)

115 (+0.9%)

mit

pluja/whishper

Transcribe any audio to text, translate and edit subtitles 100% locally with a web UI. Powered by whisper models!

1,726 (+0.8%)

agpl-3.0

jianchang512/stt

Voice Recognition to Text Tool / An offline, locally run tool that converts audio and video to subtitles, with output in JSON, SRT subtitle, and plain-text formats

2,696 (+0.7%)

gpl-3.0

MahmoudAshraf97/whisper-diarization

Automatic Speech Recognition with Speaker Diarization based on OpenAI Whisper

3,901 (+0.6%)

bsd-2-clause

AlekPet/ComfyUI_Custom_Nodes_AlekPet

Custom nodes that extend the capabilities of ComfyUI

947 (+0.5%)

mit

savbell/whisper-writer

💬📝 A small dictation app using OpenAI's Whisper speech recognition model.

386 (+0.5%)

gpl-3.0

SYSTRAN/faster-whisper

Faster Whisper transcription with CTranslate2

13,062 (+0.4%)

mit

jianchang512/pyvideotrans

Translate a video from one language to another and add dubbing. Also supports speech recognition transcription, speech synthesis, and subtitle translation.

11,111 (+0.4%)

gpl-3.0

locaal-ai/obs-localvocal

OBS plugin for local speech recognition and captioning using AI

586 (+0.3%)

gpl-2.0

VRCWizard/TTS-Voice-Wizard

Speech to Text to Speech. Song now playing. Sends text as OSC messages to VRChat to display on avatar. (STTTS) (Speech to TTS) (VRC STT System) (VTuber TTS)

612 (+0.3%)

mit

R3gm/SoniTranslate

Synchronized Translation for Videos. Video dubbing

927 (+0.3%)

apache-2.0

bugbakery/transcribee

open source audio and video transcription software

314 (+0.3%)

agpl-3.0

Last week (new repositories)

No newly created repositories trending in the last week

Last week (absolute gain)

ggerganov/whisper.cpp

Port of OpenAI's Whisper model in C/C++

36,447 (+197)

mit

SYSTRAN/faster-whisper

Faster Whisper transcription with CTranslate2

13,062 (+118)

mit

Evil0ctal/Fast-Powerful-Whisper-AI-Services-API

257 (+106)

apache-2.0

m-bain/whisperX

WhisperX: Automatic Speech Recognition with Word-level Timestamps (& Diarization)

12,958 (+98)

bsd-2-clause

FunAudioLLM/SenseVoice

Multilingual Voice Understanding Model

3,769 (+83)

jianchang512/pyvideotrans

Translate a video from one language to another and add dubbing. Also supports speech recognition transcription, speech synthesis, and subtitle translation.

11,111 (+74)

gpl-3.0

alphacep/vosk-api

Offline speech recognition API for Android, iOS, Raspberry Pi and servers with Python, Java, C# and Node

8,337 (+57)

apache-2.0

k2-fsa/sherpa-onnx

3,877 (+48)

apache-2.0

abus-aikorea/voice-pro

2,368 (+43)

mit

jianchang512/stt

Voice Recognition to Text Tool / An offline, locally run tool that converts audio and video to subtitles, with output in JSON, SRT subtitle, and plain-text formats

2,696 (+39)

gpl-3.0

MahmoudAshraf97/whisper-diarization

Automatic Speech Recognition with Speaker Diarization based on OpenAI Whisper

3,901 (+39)

bsd-2-clause

modelscope/FunClip

Open-source, accurate and easy-to-use video speech recognition & clipping tool, with LLM-based AI clipping integrated.

3,897 (+34)

mit

KoljaB/RealtimeSTT

A robust, efficient, low-latency speech-to-text library with advanced voice activity detection, wake word activation and instant transcription.

2,225 (+32)

mit

speechbrain/speechbrain

A PyTorch-based Speech Toolkit

9,088 (+31)

apache-2.0

pluja/whishper

Transcribe any audio to text, translate and edit subtitles 100% locally with a web UI. Powered by whisper models!

1,726 (+27)

agpl-3.0

tmoroney/auto-subs

Generate Subtitles & Diarize Speakers in DaVinci Resolve using AI.

698 (+23)

mit

Purfview/whisper-standalone-win

Whisper & Faster-Whisper standalone executables for those who don't want to bother with Python.

1,418 (+23)

mozilla/DeepSpeech

DeepSpeech is an open source embedded (offline, on-device) speech-to-text engine which can run in real time on devices ranging from a Raspberry Pi 4 to high power GPU servers.

25,498 (+18)

mpl-2.0

huggingface/speech-to-speech

Speech To Speech: an effort for an open-sourced and modular GPT-4o

3,626 (+14)

apache-2.0

R3gm/SoniTranslate

Synchronized Translation for Videos. Video dubbing

927 (+13)

apache-2.0

Last week (relative gain)

Evil0ctal/Fast-Powerful-Whisper-AI-Services-API

257 (+70%)

apache-2.0

tsmdt/whisply

💬 Transcribe, translate, diarize, annotate and subtitle video (and audio) with Whisper on Win, Linux and Mac ... Fast!!

25 (+9%)

apache-2.0

pavelzbornik/whisperX-FastAPI

FastAPI service on top of WhisperX

56 (+8%)

echogarden-project/echogarden

253 (+5%)

gpl-3.0

tmoroney/auto-subs

Generate Subtitles & Diarize Speakers in DaVinci Resolve using AI.

698 (+3%)

mit

patrickenfuego/Chapterize-Audiobooks

Split a single, monolithic mp3 audiobook file into chapters using Machine Learning and ffmpeg.

108 (+3%)

apache-2.0

JosefAlbers/whisper-turbo-mlx

Blazing fast whisper turbo for ASR (speech-to-text) tasks

180 (+3%)

mit

inboxpraveen/LLM-Minutes-of-Meeting

🎤📄 An innovative tool that transforms audio or video files into text transcripts and generates concise meeting minutes. Stay organized and efficient in your meetings, and get ready for Phase 2 where...

116 (+3%)

mit

SamirPaulb/real-time-voice-translator

A desktop application that uses AI to translate voice between languages in real time, while preserving the speaker's tone and emotion.

211 (+2%)

gpl-2.0

Thiagohgl/ai-pronunciation-trainer

This tool uses AI to evaluate your pronunciation.

171 (+2%)

agpl-3.0

savbell/whisper-writer

💬📝 A small dictation app using OpenAI's Whisper speech recognition model.

386 (+2%)

gpl-3.0

FunAudioLLM/SenseVoice

Multilingual Voice Understanding Model

3,769 (+2%)

ChetanXpro/nodejs-whisper

NodeJS Bindings for Whisper - the CPU version of OpenAI's Whisper, as initially crafted in C++ by ggerganov.

95 (+2%)

mit

j3soon/whisper-to-input

An Android keyboard that performs speech-to-text (STT/ASR) with OpenAI Whisper and inputs the recognized text; supports English, Chinese, Japanese, etc. and even mixed languages.

51 (+2%)

bugbakery/transcribee

open source audio and video transcription software

314 (+2%)

agpl-3.0

litongjava/whisper-cpp-server

whisper-cpp-serve Real-time speech recognition and c+ of OpenAI's Whisper model in C/C++

54 (+2%)

mit

aniemore/Aniemore

Emotion recognition from audio and text files (Russian language only)

57 (+2%)

mit

misyaguziya/VRCT

VRCT(VRChat Chatbox Translator & Transcription)

115 (+2%)

mit

Pikurrot/whisper-gui

A simple GUI to use Whisper.

116 (+2%)

mit

hanifabd/voice-activity-detection-vad-realtime

Real-time Voice Activity Detection (VAD) with example use cases such as a simple voice bot and live (real-time) transcription

58 (+2%)

Last month (new repositories)

No newly created repositories trending in the last month

Last month (absolute gain)

abus-aikorea/voice-pro

2,368 (+1,489)

mit

ggerganov/whisper.cpp

Port of OpenAI's Whisper model in C/C++

36,447 (+687)

mit

SYSTRAN/faster-whisper

Faster Whisper transcription with CTranslate2

13,062 (+505)

mit

m-bain/whisperX

WhisperX: Automatic Speech Recognition with Word-level Timestamps (& Diarization)

12,958 (+422)

bsd-2-clause

jianchang512/pyvideotrans

Translate a video from one language to another and add dubbing. Also supports speech recognition transcription, speech synthesis, and subtitle translation.

11,111 (+329)

gpl-3.0

FunAudioLLM/SenseVoice

Multilingual Voice Understanding Model

3,769 (+296)

k2-fsa/sherpa-onnx

3,877 (+237)

apache-2.0

alphacep/vosk-api

Offline speech recognition API for Android, iOS, Raspberry Pi and servers with Python, Java, C# and Node

8,337 (+196)

apache-2.0

jianchang512/stt

Voice Recognition to Text Tool / An offline, locally run tool that converts audio and video to subtitles, with output in JSON, SRT subtitle, and plain-text formats

2,696 (+177)

gpl-3.0

modelscope/FunClip

Open-source, accurate and easy-to-use video speech recognition & clipping tool, with LLM-based AI clipping integrated.

3,897 (+169)

mit

MahmoudAshraf97/whisper-diarization

Automatic Speech Recognition with Speaker Diarization based on OpenAI Whisper

3,901 (+167)

bsd-2-clause

Evil0ctal/Fast-Powerful-Whisper-AI-Services-API

257 (+150)

apache-2.0

KoljaB/RealtimeSTT

A robust, efficient, low-latency speech-to-text library with advanced voice activity detection, wake word activation and instant transcription.

2,225 (+140)

mit

speechbrain/speechbrain

A PyTorch-based Speech Toolkit

9,088 (+140)

apache-2.0

mozilla/DeepSpeech

DeepSpeech is an open source embedded (offline, on-device) speech-to-text engine which can run in real time on devices ranging from a Raspberry Pi 4 to high power GPU servers.

25,498 (+121)

mpl-2.0

leon-ai/leon

🧠 Leon is your open-source personal assistant.

15,572 (+101)

mit

ictnlp/LLaMA-Omni

LLaMA-Omni is a low-latency and high-quality end-to-end speech interaction model built upon Llama-3.1-8B-Instruct, aiming to achieve speech capabilities at the GPT-4o level.

2,670 (+95)

apache-2.0

pluja/whishper

Transcribe any audio to text, translate and edit subtitles 100% locally with a web UI. Powered by whisper models!

1,726 (+93)

agpl-3.0

Purfview/whisper-standalone-win

Whisper & Faster-Whisper standalone executables for those who don't want to bother with Python.

1,418 (+86)

tmoroney/auto-subs

Generate Subtitles & Diarize Speakers in DaVinci Resolve using AI.

698 (+85)

mit

Last month (relative gain)

abus-aikorea/voice-pro

2,368 (+169%)

mit

Evil0ctal/Fast-Powerful-Whisper-AI-Services-API

257 (+140%)

apache-2.0

tsmdt/whisply

💬 Transcribe, translate, diarize, annotate and subtitle video (and audio) with Whisper on Win, Linux and Mac ... Fast!!

25 (+47%)

apache-2.0

pavelzbornik/whisperX-FastAPI

FastAPI service on top of WhisperX

56 (+33%)

echogarden-project/echogarden

253 (+27%)

gpl-3.0

innovatorved/realtime-interview-copilot

Realtime Interview Copilot is a web application that assists users in crafting responses during interviews. It leverages real-time audio transcription and AI-powered response generation to provide rel...

41 (+17%)

j3soon/whisper-to-input

An Android keyboard that performs speech-to-text (STT/ASR) with OpenAI Whisper and inputs the recognized text; supports English, Chinese, Japanese, etc. and even mixed languages.

51 (+16%)

abus-aikorea/kara-audio

Gradio WebUI for whisper, faster-whisper, whisper-timestamped. Supports YouTube Downloader, Vocal Remover and Transcription.

31 (+15%)

gpl-3.0

tmoroney/auto-subs

Generate Subtitles & Diarize Speakers in DaVinci Resolve using AI.

698 (+14%)

mit

misyaguziya/VRCT

VRCT(VRChat Chatbox Translator & Transcription)

115 (+13%)

mit

Pikurrot/whisper-gui

A simple GUI to use Whisper.

116 (+13%)

mit

Thiagohgl/ai-pronunciation-trainer

This tool uses AI to evaluate your pronunciation.

171 (+13%)

agpl-3.0

JosefAlbers/whisper-turbo-mlx

Blazing fast whisper turbo for ASR (speech-to-text) tasks

180 (+13%)

mit

inboxpraveen/LLM-Minutes-of-Meeting

116 (+12%)

mit

patrickenfuego/Chapterize-Audiobooks

Split a single, monolithic mp3 audiobook file into chapters using Machine Learning and ffmpeg.

108 (+11%)

apache-2.0

morioka/tiny-openai-whisper-api

OpenAI Whisper API-style local server, running on FastAPI

70 (+11%)

mit

AlexisBalayre/AI-Powered-Meeting-Summarizer

Gradio-powered application that converts audio recordings of meetings into transcripts and provides concise summaries using whisper.

72 (+11%)

mit

litongjava/whisper-cpp-server

whisper-cpp-serve Real-time speech recognition and c+ of OpenAI's Whisper model in C/C++

54 (+10%)

mit

JigsawStack/insanely-fast-whisper-api

An API to transcribe audio with OpenAI's Whisper Large v3!

211 (+9%)

mit

MooreThreads/MooER

MooER: Moore-threads Open Omni model for speech-to-speech intERaction. MooER-omni includes a series of end-to-end speech interaction models along with training and inference code, covering but not lim...

178 (+9%)

Last 12 months (new repositories)

FunAudioLLM/SenseVoice

Multilingual Voice Understanding Model

3,769

huggingface/speech-to-speech

Speech To Speech: an effort for an open-sourced and modular GPT-4o

3,626

apache-2.0

jianchang512/stt

Voice Recognition to Text Tool / An offline, locally run tool that converts audio and video to subtitles, with output in JSON, SRT subtitle, and plain-text formats

2,696

gpl-3.0

ictnlp/LLaMA-Omni

LLaMA-Omni is a low-latency and high-quality end-to-end speech interaction model built upon Llama-3.1-8B-Instruct, aiming to achieve speech capabilities at the GPT-4o level.

2,670

apache-2.0

abus-aikorea/voice-pro

2,368

mit

ictnlp/StreamSpeech

StreamSpeech is an “All in One” seamless model for offline and simultaneous speech recognition, speech translation and speech synthesis.

978

mit

alesaccoia/VoiceStreamAI

Near-Realtime audio transcription using self-hosted Whisper and WebSocket in Python/JS

761

mit

mezbaul-h/june

Local voice chatbot for engaging conversations, powered by Ollama, Hugging Face Transformers, and Coqui TTS Toolkit

729

mit

soupslurpr/Transcribro

Private and on-device speech recognition keyboard and service for Android.

473

isc

revdotcom/reverb

Open source inference code for Rev's model

347

apache-2.0

Evil0ctal/Fast-Powerful-Whisper-AI-Services-API

257

apache-2.0

davidmartinrius/speech-dataset-generator

🔊 Create labeled datasets, enhance audio quality, identify speakers, support diverse dataset types. 🎧👥📊 Advanced audio processing.

213

mit

JigsawStack/insanely-fast-whisper-api

An API to transcribe audio with OpenAI's Whisper Large v3!

211

mit

JosefAlbers/whisper-turbo-mlx

Blazing fast whisper turbo for ASR (speech-to-text) tasks

180

mit

MooreThreads/MooER

178

gustavostz/whisper-clip

WhisperClip simplifies your life by automatically transcribing audio recordings and saving the text directly to your clipboard. With just a click of a button, you can effortlessly convert spoken words...

116

mit

aladinyo/ChatPlus

ChatPlus is a progressive web app developed with React, NodeJS, Firebase and other services

106

mit

viddotech/videoalchemy

VideoAlchemy is a toolkit expanding video processing capabilities, emphasizing FFmpeg and broader video technology applications.

mit

kurianbenoy/Indic-Subtitler

Open source subtitling platform 💻 for transcribing and translating videos/audios in Indic languages.

gpl-2.0

jonaskahn/asktube

AskTube - An AI-powered YouTube video summarizer and QA assistant powered by Retrieval Augmented Generation (RAG) 🤖. Run it entirely on your local machine with Ollama, or cloud-based models like Clau...

mit

Last 12 months (absolute gain)

ggerganov/whisper.cpp

Port of OpenAI's Whisper model in C/C++

36,447 (+9,979)

mit

jianchang512/pyvideotrans

Translate a video from one language to another and add dubbing. Also supports speech recognition transcription, speech synthesis, and subtitle translation.

11,111 (+8,685)

gpl-3.0

SYSTRAN/faster-whisper

Faster Whisper transcription with CTranslate2

13,062 (+6,792)

mit

m-bain/whisperX

WhisperX: Automatic Speech Recognition with Word-level Timestamps (& Diarization)

12,958 (+5,864)

bsd-2-clause

modelscope/FunClip

Open-source, accurate and easy-to-use video speech recognition & clipping tool, with LLM-based AI clipping integrated.

3,897 (+3,785)

mit

FunAudioLLM/SenseVoice

Multilingual Voice Understanding Model

3,769 (+3,767)

huggingface/speech-to-speech

Speech To Speech: an effort for an open-sourced and modular GPT-4o

3,626 (+3,625)

apache-2.0

k2-fsa/sherpa-onnx

3,877 (+3,485)

apache-2.0

jianchang512/stt

Voice Recognition to Text Tool / An offline, locally run tool that converts audio and video to subtitles, with output in JSON, SRT subtitle, and plain-text formats

2,696 (+2,687)

gpl-3.0

ictnlp/LLaMA-Omni

LLaMA-Omni is a low-latency and high-quality end-to-end speech interaction model built upon Llama-3.1-8B-Instruct, aiming to achieve speech capabilities at the GPT-4o level.

2,670 (+2,666)

apache-2.0

MahmoudAshraf97/whisper-diarization

Automatic Speech Recognition with Speaker Diarization based on OpenAI Whisper

3,901 (+2,496)

bsd-2-clause

abus-aikorea/voice-pro

2,368 (+2,367)

mit

speechbrain/speechbrain

A PyTorch-based Speech Toolkit

9,088 (+2,102)

apache-2.0

mozilla/DeepSpeech

DeepSpeech is an open source embedded (offline, on-device) speech-to-text engine which can run in real time on devices ranging from a Raspberry Pi 4 to high power GPU servers.

25,498 (+1,934)

mpl-2.0

KoljaB/RealtimeSTT

A robust, efficient, low-latency speech-to-text library with advanced voice activity detection, wake word activation and instant transcription.

2,225 (+1,852)

mit

alphacep/vosk-api

Offline speech recognition API for Android, iOS, Raspberry Pi and servers with Python, Java, C# and Node

8,337 (+1,833)

apache-2.0

leon-ai/leon

🧠 Leon is your open-source personal assistant.

15,572 (+1,630)

mit

pluja/whishper

Transcribe any audio to text, translate and edit subtitles 100% locally with a web UI. Powered by whisper models!

1,726 (+1,325)

agpl-3.0

bugbakery/audapolis

an editor for spoken-word audio with automatic transcription

1,703 (+1,105)

agpl-3.0

kaldi-asr/kaldi

kaldi-asr/kaldi is the official location of the Kaldi project.

14,379 (+999)

Last 12 months (relative gain)

ictnlp/LLaMA-Omni

LLaMA-Omni is a low-latency and high-quality end-to-end speech interaction model built upon Llama-3.1-8B-Instruct, aiming to achieve speech capabilities at the GPT-4o level.

2,670 (+66,650%)

apache-2.0

jianchang512/stt

Voice Recognition to Text Tool / An offline, locally run tool that converts audio and video to subtitles, with output in JSON, SRT subtitle, and plain-text formats

2,696 (+29,856%)

gpl-3.0

ictnlp/StreamSpeech

StreamSpeech is an “All in One” seamless model for offline and simultaneous speech recognition, speech translation and speech synthesis.

978 (+16,200%)

mit

soupslurpr/Transcribro

Private and on-device speech recognition keyboard and service for Android.

473 (+11,725%)

isc

modelscope/FunClip

Open-source, accurate and easy-to-use video speech recognition & clipping tool, with LLM-based AI clipping integrated.

3,897 (+3,379%)

mit

alesaccoia/VoiceStreamAI

Near-Realtime audio transcription using self-hosted Whisper and WebSocket in Python/JS

761 (+3,071%)

mit

deepgram-starters/nextjs-live-transcription

Live transcription in Next.js by Deepgram

152 (+2,433%)

mit

tmoroney/auto-subs

Generate Subtitles & Diarize Speakers in DaVinci Resolve using AI.

698 (+2,393%)

mit

JigsawStack/insanely-fast-whisper-api

An API to transcribe audio with OpenAI's Whisper Large v3!

211 (+2,010%)

mit

R3gm/SoniTranslate

Synchronized Translation for Videos. Video dubbing

927 (+1,915%)

apache-2.0

paulovcmedeiros/pyRobBot

Chat with GPT LLMs over voice, UI & terminal, all with access to the internet. Powered by OpenAI.

118 (+1,586%)

mit

AlexisBalayre/AI-Powered-Meeting-Summarizer

Gradio-powered application that converts audio recordings of meetings into transcripts and provides concise summaries using whisper.

72 (+1,340%)

mit

hugobloem/wyoming-microsoft-stt

Wyoming protocol server for Microsoft Azure speech-to-text

42 (+950%)

j3soon/whisper-to-input

An Android keyboard that performs speech-to-text (STT/ASR) with OpenAI Whisper and inputs the recognized text; supports English, Chinese, Japanese, etc. and even mixed languages.

51 (+920%)

k2-fsa/sherpa-onnx

3,877 (+889%)

apache-2.0

inboxpraveen/LLM-Minutes-of-Meeting

116 (+867%)

mit

misyaguziya/VRCT

VRCT(VRChat Chatbox Translator & Transcription)

115 (+858%)

mit

JSchmie/ScrAIbe

Tool for automatic transcription and speaker diarization based on whisper and pyannote.

35 (+775%)

gpl-3.0

smalltong02/keras-llm-robot

A web UI project for learning about large language models. This project includes features such as chat, quantization, fine-tuning, prompt engineering templates, and multimodality.

229 (+748%)

apache-2.0

SamirPaulb/real-time-voice-translator

A desktop application that uses AI to translate voice between languages in real time, while preserving the speaker's tone and emotion.

211 (+744%)

gpl-2.0
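
How the absolute and relative figures relate

The listing does not say how the relative gain is derived. The figures are consistent with dividing the absolute gain by the star count at the start of the period (current stars minus the gain), so the short sketch below works under that assumption. The function name relative_gain_percent is hypothetical and is not part of any tool behind this listing.

```python
def relative_gain_percent(current_stars, absolute_gain):
    """Return growth as a percentage of the star count at the start of the period.

    Assumption: relative gain = absolute gain / (current stars - absolute gain).
    """
    previous = current_stars - absolute_gain
    if previous <= 0:
        # Growth is undefined for a repository that began the period with no stars.
        return None
    return 100.0 * absolute_gain / previous


# Figures taken from the listing above.
print(round(relative_gain_percent(257, 52)))      # last 3 days, listed as +25%  -> 25
print(round(relative_gain_percent(257, 106)))     # last week, listed as +70%    -> 70
print(round(relative_gain_percent(2368, 1489)))   # last month, listed as +169%  -> 169
print(round(relative_gain_percent(2670, 2666)))   # 12 months, listed as +66,650% -> 66650
```

This also explains why small or very new repositories dominate the relative rankings: a repository that started the period with only a handful of stars can show growth of several thousand percent from a modest absolute gain.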