23 results found

2.5k
12.2k
apache-2.0
207
A scalable generative AI framework built for researchers and developers working on Large Language Models, Multimodal, and Speech AI (Automatic Speech Recognition and Text-to-Speech)
Created 2019-08-05
7,534 commits to main branch, last one 16 hours ago
1.9k
11.2k
apache-2.0
184
Easy-to-use Speech Toolkit including Self-Supervised Learning model, SOTA/Streaming ASR with punctuation, Streaming TTS with text frontend, Speaker Verification System, End-to-End Speech Translation a...
Created 2017-11-14
4,784 commits to develop branch, last one a day ago
370
3.5k
apache-2.0
44
Speech To Speech: an effort for an open-sourced and modular GPT-4o
Created 2024-08-07
200 commits to main branch, last one 28 days ago
114
1.2k
mit
23
Unified-Modal Speech-Text Pre-Training for Spoken Language Processing
Created 2022-02-08
242 commits to main branch, last one 7 months ago
StreamSpeech is an “All in One” seamless model for offline and simultaneous speech recognition, speech translation and speech synthesis.
Created 2024-06-04
25 commits to main branch, last one 2 months ago
Paper list of simultaneous translation / streaming translation, including text-to-text machine translation and speech-to-text translation.
Created 2022-03-21
52 commits to main branch, last one 5 months ago
A real-time speech transcription and translation application using OpenAI Whisper and a free translation API. Interface built with Tkinter. Code written entirely in Python.
Created 2022-10-31
263 commits to master branch, last one 10 months ago
Tracking the progress in end-to-end speech translation
Created 2020-03-02
77 commits to main branch, last one about a year ago
Easy-to-use speech toolset. Written in TypeScript. Includes tools for synthesis, recognition, alignment, speech translation, language detection, source separation and more.
Created 2023-04-20
771 commits to main branch, last one 17 hours ago
12
162
other
9
MooER: Moore-threads Open Omni model for speech-to-speech intERaction. MooER-omni includes a series of end-to-end speech interaction models along with training and inference code, covering but not lim...
Created 2024-08-12
50 commits to master branch, last one 15 days ago
Code for the paper "Cross-modal Contrastive Learning for Speech Translation" (NAACL 2022)
Created 2022-04-28
7 commits to main branch, last one 2 years ago
5
60
unknown
4
Code for NeurIPS 2023 paper "DASpeech: Directed Acyclic Transformer for Fast and High-quality Speech-to-Speech Translation".
Created 2023-10-07
22 commits to main branch, last one 4 months ago
Repository containing the open source code of works published at the FBK MT unit.
Created 2022-04-02
1,935 commits to master branch, last one 4 months ago
4
37
mit
6
SHAS: Approaching optimal Segmentation for End-to-End Speech Translation
Created 2022-02-09
19 commits to main branch, last one about a year ago
7
36
mit
2
Code for ACL 2022 main conference paper "STEMM: Self-learning with Speech-text Manifold Mixup for Speech Translation".
Created 2022-03-15
7 commits to main branch, last one about a year ago
2
33
mit
3
Source code for ACL 2023 paper "End-to-End Simultaneous Speech Translation with Differentiable Segmentation"
Created 2023-05-22
16 commits to main branch, last one 11 months ago
A fast parallel PyTorch implementation of "CIF: Continuous Integrate-and-Fire for End-to-End Speech Recognition" (https://arxiv.org/abs/1905.11235).
Created 2022-02-11
14 commits to main branch, last one 9 months ago
List of direct speech-to-speech translation papers.
Created 2022-05-20
4 commits to master branch, last one about a year ago
Code for the INTERSPEECH 2023 paper "Learning When to Speak: Latency and Quality Trade-offs for Simultaneous Speech-to-Speech Translation with Offline Models"
Created 2023-01-31
53 commits to main branch, last one about a year ago