Statistics for topic speech
RepositoryStats tracks 594,991 GitHub repositories; of these, 308 are tagged with the speech topic. The most common primary language for repositories using this topic is Python (181). Other languages include Jupyter Notebook (29) and JavaScript (15).
Stargazers over time for topic speech
Most starred repositories for topic speech
Trending repositories for topic speech
🐸💬 - a deep learning toolkit for Text-to-Speech, battle-tested in research and production
Beginner-friendly AI spoken-conversation practice application
WhisperX: Automatic Speech Recognition with Word-level Timestamps (& Diarization)
Voice Recognition to Text Tool / An offline, locally run tool that converts audio and video to subtitles, with output in JSON, SRT subtitle, and plain-text formats
Beginner-friendly AI spoken-conversation practice application
Free, high quality text-to-speech for your Obsidian notes, leveraging Microsoft Edge's Read Aloud API.
Cross-platform speech toolset, used from the command-line or as a Node.js library. Includes a variety of engines for speech synthesis, speech recognition, forced alignment, speech translation, voice i...
SCOREQ: Speech COntrastive REgression for Quality Assessment (NeurIPS 2024)
Beginner-friendly AI spoken-conversation practice application
🐸💬 - a deep learning toolkit for Text-to-Speech, battle-tested in research and production
WhisperX: Automatic Speech Recognition with Word-level Timestamps (& Diarization)
Grounded SAM: Marrying Grounding DINO with Segment Anything & Stable Diffusion & Recognize Anything - Automatically Detect, Segment and Generate Anything
Beginner-friendly AI spoken-conversation practice application
Cross-platform speech toolset, used from the command-line or as a Node.js library. Includes a variety of engines for speech synthesis, speech recognition, forced alignment, speech translation, voice i...
Free, high quality text-to-speech for your Obsidian notes, leveraging Microsoft Edge's Read Aloud API.
SCOREQ: Speech COntrastive REgression for Quality Assessment (NeurIPS 2024)
✨✨Freeze-Omni: A Smart and Low Latency Speech-to-speech Dialogue Model with Frozen LLM
🐸💬 - a deep learning toolkit for Text-to-Speech, battle-tested in research and production
WhisperX: Automatic Speech Recognition with Word-level Timestamps (& Diarization)
Beginner-friendly AI spoken-conversation practice application
Free, high quality text-to-speech for your Obsidian notes, leveraging Microsoft Edge's Read Aloud API.
✨✨Freeze-Omni: A Smart and Low Latency Speech-to-speech Dialogue Model with Frozen LLM
Verbatim Automatic Speech Recognition with improved word-level timestamps and filler detection
SCOREQ: Speech COntrastive REgression for Quality Assessment (NeurIPS 2024)
Speech To Speech: an effort for an open-sourced and modular GPT4-o
Voice Recognition to Text Tool / An offline, locally run tool that converts audio and video to subtitles, with output in JSON, SRT subtitle, and plain-text formats
🐸💬 - a deep learning toolkit for Text-to-Speech, battle-tested in research and production
WhisperX: Automatic Speech Recognition with Word-level Timestamps (& Diarization)
Foundational model for human-like, expressive TTS
Speech To Speech: an effort for an open-sourced and modular GPT4-o
Voice Recognition to Text Tool / An offline, locally run tool that converts audio and video to subtitles, with output in JSON, SRT subtitle, and plain-text formats
StreamSpeech is an “All in One” seamless model for offline and simultaneous speech recognition, speech translation and speech synthesis.
Foundational model for human-like, expressive TTS