Trending repositories for topic speaker-recognition
A scalable generative AI framework built for researchers and developers working on Large Language Models, Multimodal, and Speech AI (Automatic Speech Recognition and Text-to-Speech)
Neural building blocks for speaker diarization: speech activity detection, speaker change detection, overlapped speech detection, speaker embedding
Research and Production Oriented Speaker Verification, Recognition and Diarization Toolkit
Unofficial reimplementation of ECAPA-TDNN for speaker recognition (EER=0.86 for Vox1_O when train only in Vox2)
Research and Production Oriented Speaker Verification, Recognition and Diarization Toolkit
Neural building blocks for speaker diarization: speech activity detection, speaker change detection, overlapped speech detection, speaker embedding
Unofficial reimplementation of ECAPA-TDNN for speaker recognition (EER=0.86 for Vox1_O when train only in Vox2)
A scalable generative AI framework built for researchers and developers working on Large Language Models, Multimodal, and Speech AI (Automatic Speech Recognition and Text-to-Speech)
A scalable generative AI framework built for researchers and developers working on Large Language Models, Multimodal, and Speech AI (Automatic Speech Recognition and Text-to-Speech)
Neural building blocks for speaker diarization: speech activity detection, speaker change detection, overlapped speech detection, speaker embedding
Research and Production Oriented Speaker Verification, Recognition and Diarization Toolkit
This project uses a variety of advanced voiceprint recognition models such as EcapaTdnn, ResNetSE, ERes2Net, CAM++, etc. It is not excluded that more models will be supported in the future. At the sam...
A desktop application that uses AI to translate voice between languages in real time, while preserving the speaker's tone and emotion.
Unofficial reimplementation of ECAPA-TDNN for speaker recognition (EER=0.86 for Vox1_O when train only in Vox2)
speechlib is a library that can do speaker diarization, transcription and speaker recognition on an audio file to create transcripts with actual speaker names
本项目使用了EcapaTdnn、ResNetSE、ERes2Net、CAM++等多种先进的声纹识别模型,同时本项目也支持了MelSpectrogram、Spectrogram、MFCC、Fbank等多种数据预处理方法
SincNet is a neural architecture for efficiently processing raw audio samples.
This is the library for the Unbounded Interleaved-State Recurrent Neural Network (UIS-RNN) algorithm, corresponding to the paper Fully Supervised Speaker Diarization.
A desktop application that uses AI to translate voice between languages in real time, while preserving the speaker's tone and emotion.
Research and Production Oriented Speaker Verification, Recognition and Diarization Toolkit
speechlib is a library that can do speaker diarization, transcription and speaker recognition on an audio file to create transcripts with actual speaker names
This project uses a variety of advanced voiceprint recognition models such as EcapaTdnn, ResNetSE, ERes2Net, CAM++, etc. It is not excluded that more models will be supported in the future. At the sam...
Neural building blocks for speaker diarization: speech activity detection, speaker change detection, overlapped speech detection, speaker embedding
Unofficial reimplementation of ECAPA-TDNN for speaker recognition (EER=0.86 for Vox1_O when train only in Vox2)
本项目使用了EcapaTdnn、ResNetSE、ERes2Net、CAM++等多种先进的声纹识别模型,同时本项目也支持了MelSpectrogram、Spectrogram、MFCC、Fbank等多种数据预处理方法
A scalable generative AI framework built for researchers and developers working on Large Language Models, Multimodal, and Speech AI (Automatic Speech Recognition and Text-to-Speech)
SincNet is a neural architecture for efficiently processing raw audio samples.
This is the library for the Unbounded Interleaved-State Recurrent Neural Network (UIS-RNN) algorithm, corresponding to the paper Fully Supervised Speaker Diarization.
A scalable generative AI framework built for researchers and developers working on Large Language Models, Multimodal, and Speech AI (Automatic Speech Recognition and Text-to-Speech)
Neural building blocks for speaker diarization: speech activity detection, speaker change detection, overlapped speech detection, speaker embedding
Research and Production Oriented Speaker Verification, Recognition and Diarization Toolkit
This project uses a variety of advanced voiceprint recognition models such as EcapaTdnn, ResNetSE, ERes2Net, CAM++, etc. It is not excluded that more models will be supported in the future. At the sam...
A desktop application that uses AI to translate voice between languages in real time, while preserving the speaker's tone and emotion.
speechlib is a library that can do speaker diarization, transcription and speaker recognition on an audio file to create transcripts with actual speaker names
This repository contains audio samples and supplementary materials accompanying publications by the "Speaker, Voice and Language" team at Google.
本项目使用了EcapaTdnn、ResNetSE、ERes2Net、CAM++等多种先进的声纹识别模型,同时本项目也支持了MelSpectrogram、Spectrogram、MFCC、Fbank等多种数据预处理方法
Unofficial reimplementation of ECAPA-TDNN for speaker recognition (EER=0.86 for Vox1_O when train only in Vox2)
This is the library for the Unbounded Interleaved-State Recurrent Neural Network (UIS-RNN) algorithm, corresponding to the paper Fully Supervised Speaker Diarization.
speaker diarization by uis-rnn and speaker embedding by vgg-speaker-recognition
SincNet is a neural architecture for efficiently processing raw audio samples.
:speaker: Deep Learning & 3D Convolutional Neural Networks for Speaker Verification
an open-source implementation of sequence-to-sequence based speech processing engine
A desktop application that uses AI to translate voice between languages in real time, while preserving the speaker's tone and emotion.
speechlib is a library that can do speaker diarization, transcription and speaker recognition on an audio file to create transcripts with actual speaker names
Research and Production Oriented Speaker Verification, Recognition and Diarization Toolkit
This project uses a variety of advanced voiceprint recognition models such as EcapaTdnn, ResNetSE, ERes2Net, CAM++, etc. It is not excluded that more models will be supported in the future. At the sam...
本项目使用了EcapaTdnn、ResNetSE、ERes2Net、CAM++等多种先进的声纹识别模型,同时本项目也支持了MelSpectrogram、Spectrogram、MFCC、Fbank等多种数据预处理方法
This repository contains audio samples and supplementary materials accompanying publications by the "Speaker, Voice and Language" team at Google.
Neural building blocks for speaker diarization: speech activity detection, speaker change detection, overlapped speech detection, speaker embedding
a Pytorch library for security research on speaker recognition, released in "Towards Understanding and Mitigating Audio Adversarial Examples for Speaker Recognition" accepted by TDSC
A scalable generative AI framework built for researchers and developers working on Large Language Models, Multimodal, and Speech AI (Automatic Speech Recognition and Text-to-Speech)
语音算法相关资源汇总 Resource for Speech Processing || NEWS: official link of VoxCeleb fails recently and an external link is added for download
Unofficial reimplementation of ECAPA-TDNN for speaker recognition (EER=0.86 for Vox1_O when train only in Vox2)
speaker diarization by uis-rnn and speaker embedding by vgg-speaker-recognition
Deep Learning - one shot learning for speaker recognition using Filter Banks
speechlib is a library that can do speaker diarization, transcription and speaker recognition on an audio file to create transcripts with actual speaker names
A scalable generative AI framework built for researchers and developers working on Large Language Models, Multimodal, and Speech AI (Automatic Speech Recognition and Text-to-Speech)
Neural building blocks for speaker diarization: speech activity detection, speaker change detection, overlapped speech detection, speaker embedding
Research and Production Oriented Speaker Verification, Recognition and Diarization Toolkit
This project uses a variety of advanced voiceprint recognition models such as EcapaTdnn, ResNetSE, ERes2Net, CAM++, etc. It is not excluded that more models will be supported in the future. At the sam...
A desktop application that uses AI to translate voice between languages in real time, while preserving the speaker's tone and emotion.
speechlib is a library that can do speaker diarization, transcription and speaker recognition on an audio file to create transcripts with actual speaker names
Unofficial reimplementation of ECAPA-TDNN for speaker recognition (EER=0.86 for Vox1_O when train only in Vox2)
Aims to create a comprehensive voice toolkit for training, testing, and deploying speaker verification systems.
This repository contains audio samples and supplementary materials accompanying publications by the "Speaker, Voice and Language" team at Google.
SincNet is a neural architecture for efficiently processing raw audio samples.
本项目使用了EcapaTdnn、ResNetSE、ERes2Net、CAM++等多种先进的声纹识别模型,同时本项目也支持了MelSpectrogram、Spectrogram、MFCC、Fbank等多种数据预处理方法
This is the library for the Unbounded Interleaved-State Recurrent Neural Network (UIS-RNN) algorithm, corresponding to the paper Fully Supervised Speaker Diarization.
speaker diarization by uis-rnn and speaker embedding by vgg-speaker-recognition
Deep Learning - one shot learning for speaker recognition using Filter Banks
an open-source implementation of sequence-to-sequence based speech processing engine
speechlib is a library that can do speaker diarization, transcription and speaker recognition on an audio file to create transcripts with actual speaker names
A desktop application that uses AI to translate voice between languages in real time, while preserving the speaker's tone and emotion.
Introduction to Speech Processing
Research and Production Oriented Speaker Verification, Recognition and Diarization Toolkit
a Pytorch library for security research on speaker recognition, released in "Towards Understanding and Mitigating Audio Adversarial Examples for Speaker Recognition" accepted by TDSC
This project uses a variety of advanced voiceprint recognition models such as EcapaTdnn, ResNetSE, ERes2Net, CAM++, etc. It is not excluded that more models will be supported in the future. At the sam...
Neural building blocks for speaker diarization: speech activity detection, speaker change detection, overlapped speech detection, speaker embedding
Aims to create a comprehensive voice toolkit for training, testing, and deploying speaker verification systems.
This repository contains audio samples and supplementary materials accompanying publications by the "Speaker, Voice and Language" team at Google.
本项目使用了EcapaTdnn、ResNetSE、ERes2Net、CAM++等多种先进的声纹识别模型,同时本项目也支持了MelSpectrogram、Spectrogram、MFCC、Fbank等多种数据预处理方法
A scalable generative AI framework built for researchers and developers working on Large Language Models, Multimodal, and Speech AI (Automatic Speech Recognition and Text-to-Speech)
Speaker identification/verification models for Machine Learning for Computer Vision class at UNIBO
Unofficial reimplementation of ECAPA-TDNN for speaker recognition (EER=0.86 for Vox1_O when train only in Vox2)
Deep Learning - one shot learning for speaker recognition using Filter Banks
OpenSpeaker is a completely independent and open source speaker recognition project. It provides the entire process of speaker recognition including multi-platform deployment and model optimization.