Statistics for topic speech-to-text
RepositoryStats tracks 518,325 GitHub repositories, of which 295 are tagged with the speech-to-text topic. The most common primary language for repositories using this topic is Python (135). Other languages include JavaScript (29), TypeScript (21), Jupyter Notebook (20), and C++ (12).
Stargazers over time for topic speech-to-text
Most starred repositories for topic speech-to-text
Trending repositories for topic speech-to-text
Open-source, accurate, and easy-to-use video speech recognition and clipping tool with integrated LLM-based AI clipping.
Translate the video from one language to another and add dubbing.
WhisperX: Automatic Speech Recognition with Word-level Timestamps (& Diarization)
A web UI project for learning about large language models. This project includes features such as chat, quantization, fine-tuning, prompt engineering templates, and multimodality.
A desktop application that uses AI to translate voice between languages in real time, while preserving the speaker's tone and emotion.
Speech-to-text, text-to-speech, and speaker recognition using next-gen Kaldi with onnxruntime, without an Internet connection. Supports embedded systems, Android, iOS, Raspberry Pi, RISC-V, x86_64 servers,...
Easy-to-use speech toolset. Written in TypeScript. Includes tools for synthesis, recognition, alignment, speech translation, language detection, source separation and more.
VietGPT VoiceBot: a chatbot that automatically recognizes Vietnamese voice input and uses the ChatGPT API for natural language interaction.
Generate subtitles using OpenAI Whisper in DaVinci Resolve editing software.
A desktop application that transcribes audio from files, microphone input or YouTube videos with the option to translate the content and create subtitles.
Voice Recognition to Text Tool / An offline, locally run speech-recognition-to-text service that outputs JSON, SRT subtitles with timestamps, and plain text.
A robust, efficient, low-latency speech-to-text library with advanced voice activity detection, wake word activation and instant transcription.
Transcribe any audio to text, translate and edit subtitles 100% locally with a web UI. Powered by whisper models!
DeepSpeech is an open source embedded (offline, on-device) speech-to-text engine which can run in real time on devices ranging from a Raspberry Pi 4 to high power GPU servers.
Transcribe and translate voice into LRC files using Whisper and LLMs (GPT, Claude, et al.).
Speech Note Linux app. Note taking, reading and translating with offline Speech to Text, Text to Speech and Machine translation.