23 results found

2.6k
12.6k
apache-2.0
210
A scalable generative AI framework built for researchers and developers working on Large Language Models, Multimodal, and Speech AI (Automatic Speech Recognition and Text-to-Speech)
Created 2019-08-05
7,732 commits to main branch, last one 4 days ago
1.9k
11.3k
apache-2.0
185
Easy-to-use Speech Toolkit including Self-Supervised Learning model, SOTA/Streaming ASR with punctuation, Streaming TTS with text frontend, Speaker Verification System, End-to-End Speech Translation a...
Created 2017-11-14
4,820 commits to develop branch, last one 3 days ago
389
3.6k
apache-2.0
45
Speech To Speech: an effort for an open-sourced and modular GPT4-o
Created 2024-08-07
220 commits to main branch, last one 25 days ago
119
1.2k
mit
24
Unified-Modal Speech-Text Pre-Training for Spoken Language Processing
Created 2022-02-08
242 commits to main branch, last one 8 months ago
StreamSpeech is an “All in One” seamless model for offline and simultaneous speech recognition, speech translation and speech synthesis.
Created 2024-06-04
25 commits to main branch, last one 4 months ago
Paper list of simultaneous translation / streaming translation, including text-to-text machine translation and speech-to-text translation.
Created 2022-03-21
52 commits to main branch, last one 6 months ago
A realtime speech transcription and translation application using OpenAI Whisper and a free translation API. The interface is built with Tkinter, and the code is written entirely in Python.
Created 2022-10-31
263 commits to master branch, last one 11 months ago
Cross-platform speech toolset, used from the command-line or as a Node.js library. Includes a variety of engines for speech synthesis, speech recognition, forced alignment, speech translation, voice i...
Created 2023-04-20
802 commits to main branch, last one 6 days ago
Tracking the progress in end-to-end speech translation
Created 2020-03-02
77 commits to main branch, last one about a year ago
12
179
other
8
MooER: Moore-threads Open Omni model for speech-to-speech intERaction. MooER-omni includes a series of end-to-end speech interaction models along with training and inference code, covering but not lim...
Created 2024-08-12
53 commits to master branch, last one 12 days ago
Code for the paper "Cross-modal Contrastive Learning for Speech Translation" (NAACL 2022)
Created 2022-04-28
7 commits to main branch, last one 2 years ago
5
61
unknown
4
Code for NeurIPS 2023 paper "DASpeech: Directed Acyclic Transformer for Fast and High-quality Speech-to-Speech Translation".
Created 2023-10-07
22 commits to main branch, last one 5 months ago
Repository containing the open source code of works published at the FBK MT unit.
Created 2022-04-02
1,935 commits to master branch, last one 6 months ago
4
37
mit
6
SHAS: Approaching optimal Segmentation for End-to-End Speech Translation
Created 2022-02-09
19 commits to main branch, last one about a year ago
List of direct speech-to-speech translation papers.
Created 2022-05-20
4 commits to master branch, last one about a year ago
7
36
mit
2
Code for ACL 2022 main conference paper "STEMM: Self-learning with Speech-text Manifold Mixup for Speech Translation".
Created 2022-03-15
7 commits to main branch, last one about a year ago
2
33
mit
3
Source code for ACL 2023 paper "End-to-End Simultaneous Speech Translation with Differentiable Segmentation"
Created 2023-05-22
16 commits to main branch, last one about a year ago
A fast parallel PyTorch implementation of the "CIF: Continuous Integrate-and-Fire for End-to-End Speech Recognition" https://arxiv.org/abs/1905.11235.
Created 2022-02-11
14 commits to main branch, last one 10 months ago
Code for the INTERSPEECH 2023 paper "Learning When to Speak: Latency and Quality Trade-offs for Simultaneous Speech-to-Speech Translation with Offline Models"
Created 2023-01-31
53 commits to main branch, last one about a year ago