Trending repositories for topic speaker-recognition
A scalable generative AI framework built for researchers and developers working on Large Language Models, Multimodal, and Speech AI (Automatic Speech Recognition and Text-to-Speech)
Neural building blocks for speaker diarization: speech activity detection, speaker change detection, overlapped speech detection, speaker embedding
Research and Production Oriented Speaker Verification, Recognition and Diarization Toolkit
This project uses a variety of advanced voiceprint recognition models such as EcapaTdnn, ResNetSE, ERes2Net, and CAM++. More models may be supported in the future. At the sam...
SincNet is a neural architecture for efficiently processing raw audio samples.
speechlib is a library that can do speaker diarization, transcription and speaker recognition on an audio file to create transcripts with actual speaker names
Unofficial reimplementation of ECAPA-TDNN for speaker recognition (EER=0.86 for Vox1_O when trained only on Vox2)
This repository contains audio samples and supplementary materials accompanying publications by the "Speaker, Voice and Language" team at Google.
A desktop application that uses AI to translate voice between languages in real time, while preserving the speaker's tone and emotion.
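This project uses a variety of advanced voiceprint recognition models, including EcapaTdnn, ResNetSE, ERes2Net, and CAM++, and also supports multiple data preprocessing methods such as MelSpectrogram, Spectrogram, MFCC, and Fbank.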
This is the library for the Unbounded Interleaved-State Recurrent Neural Network (UIS-RNN) algorithm, corresponding to the paper Fully Supervised Speaker Diarization.
Deep Learning - one shot learning for speaker recognition using Filter Banks
Speaker diarization with UIS-RNN and speaker embeddings from vgg-speaker-recognition
Speaker identification/verification models for Machine Learning for Computer Vision class at UNIBO
Baseline Recipe for VoicePrivacy Challenge 2022: anonymization systems and evaluation software
OpenSpeaker is a completely independent and open source speaker recognition project. It provides the entire process of speaker recognition including multi-platform deployment and model optimization.
ICASSP 2022: 'Self-supervised Speaker Recognition with Loss-gated Learning'
Aims to create a comprehensive voice toolkit for training, testing, and deploying speaker verification systems.
An open-source implementation of a sequence-to-sequence based speech processing engine
The SpeechBrain project aims to build a novel speech toolkit fully based on PyTorch. With SpeechBrain users can easily create speech processing systems, ranging from speech recognition (both HMM/DNN a...
Introduction to Speech Processing
A PyTorch library for security research on speaker recognition, released with "Towards Understanding and Mitigating Audio Adversarial Examples for Speaker Recognition", accepted by TDSC
A curated list of speaker-embedding, speaker-verification, and speaker-identification resources.