Trending repositories for topic speaker-recognition
A scalable generative AI framework built for researchers and developers working on Large Language Models, Multimodal, and Speech AI (Automatic Speech Recognition and Text-to-Speech)
Neural building blocks for speaker diarization: speech activity detection, speaker change detection, overlapped speech detection, speaker embedding
This project uses a variety of advanced voiceprint recognition models such as EcapaTdnn, ResNetSE, ERes2Net, CAM++, etc. It is not excluded that more models will be supported in the future. At the sam...
speechlib is a library that can do speaker diarization, transcription and speaker recognition on an audio file to create transcripts with actual speaker names
本项目使用了EcapaTdnn、ResNetSE、ERes2Net、CAM++等多种先进的声纹识别模型,同时本项目也支持了MelSpectrogram、Spectrogram、MFCC、Fbank等多种数据预处理方法
Unofficial reimplementation of ECAPA-TDNN for speaker recognition (EER=0.86 for Vox1_O when train only in Vox2)
speechlib is a library that can do speaker diarization, transcription and speaker recognition on an audio file to create transcripts with actual speaker names
This project uses a variety of advanced voiceprint recognition models such as EcapaTdnn, ResNetSE, ERes2Net, CAM++, etc. It is not excluded that more models will be supported in the future. At the sam...
本项目使用了EcapaTdnn、ResNetSE、ERes2Net、CAM++等多种先进的声纹识别模型,同时本项目也支持了MelSpectrogram、Spectrogram、MFCC、Fbank等多种数据预处理方法
Neural building blocks for speaker diarization: speech activity detection, speaker change detection, overlapped speech detection, speaker embedding
A scalable generative AI framework built for researchers and developers working on Large Language Models, Multimodal, and Speech AI (Automatic Speech Recognition and Text-to-Speech)
Unofficial reimplementation of ECAPA-TDNN for speaker recognition (EER=0.86 for Vox1_O when train only in Vox2)
A scalable generative AI framework built for researchers and developers working on Large Language Models, Multimodal, and Speech AI (Automatic Speech Recognition and Text-to-Speech)
Neural building blocks for speaker diarization: speech activity detection, speaker change detection, overlapped speech detection, speaker embedding
This project uses a variety of advanced voiceprint recognition models such as EcapaTdnn, ResNetSE, ERes2Net, CAM++, etc. It is not excluded that more models will be supported in the future. At the sam...
Research and Production Oriented Speaker Verification, Recognition and Diarization Toolkit
A desktop application that uses AI to translate voice between languages in real time, while preserving the speaker's tone and emotion.
Unofficial reimplementation of ECAPA-TDNN for speaker recognition (EER=0.86 for Vox1_O when train only in Vox2)
speechlib is a library that can do speaker diarization, transcription and speaker recognition on an audio file to create transcripts with actual speaker names
本项目使用了EcapaTdnn、ResNetSE、ERes2Net、CAM++等多种先进的声纹识别模型,同时本项目也支持了MelSpectrogram、Spectrogram、MFCC、Fbank等多种数据预处理方法
This is the library for the Unbounded Interleaved-State Recurrent Neural Network (UIS-RNN) algorithm, corresponding to the paper Fully Supervised Speaker Diarization.
A desktop application that uses AI to translate voice between languages in real time, while preserving the speaker's tone and emotion.
speechlib is a library that can do speaker diarization, transcription and speaker recognition on an audio file to create transcripts with actual speaker names
This project uses a variety of advanced voiceprint recognition models such as EcapaTdnn, ResNetSE, ERes2Net, CAM++, etc. It is not excluded that more models will be supported in the future. At the sam...
本项目使用了EcapaTdnn、ResNetSE、ERes2Net、CAM++等多种先进的声纹识别模型,同时本项目也支持了MelSpectrogram、Spectrogram、MFCC、Fbank等多种数据预处理方法
Research and Production Oriented Speaker Verification, Recognition and Diarization Toolkit
A scalable generative AI framework built for researchers and developers working on Large Language Models, Multimodal, and Speech AI (Automatic Speech Recognition and Text-to-Speech)
Neural building blocks for speaker diarization: speech activity detection, speaker change detection, overlapped speech detection, speaker embedding
Unofficial reimplementation of ECAPA-TDNN for speaker recognition (EER=0.86 for Vox1_O when train only in Vox2)
This is the library for the Unbounded Interleaved-State Recurrent Neural Network (UIS-RNN) algorithm, corresponding to the paper Fully Supervised Speaker Diarization.
A scalable generative AI framework built for researchers and developers working on Large Language Models, Multimodal, and Speech AI (Automatic Speech Recognition and Text-to-Speech)
Neural building blocks for speaker diarization: speech activity detection, speaker change detection, overlapped speech detection, speaker embedding
This project uses a variety of advanced voiceprint recognition models such as EcapaTdnn, ResNetSE, ERes2Net, CAM++, etc. It is not excluded that more models will be supported in the future. At the sam...
Research and Production Oriented Speaker Verification, Recognition and Diarization Toolkit
Unofficial reimplementation of ECAPA-TDNN for speaker recognition (EER=0.86 for Vox1_O when train only in Vox2)
A desktop application that uses AI to translate voice between languages in real time, while preserving the speaker's tone and emotion.
speechlib is a library that can do speaker diarization, transcription and speaker recognition on an audio file to create transcripts with actual speaker names
This repository contains audio samples and supplementary materials accompanying publications by the "Speaker, Voice and Language" team at Google.
本项目使用了EcapaTdnn、ResNetSE、ERes2Net、CAM++等多种先进的声纹识别模型,同时本项目也支持了MelSpectrogram、Spectrogram、MFCC、Fbank等多种数据预处理方法
SincNet is a neural architecture for efficiently processing raw audio samples.
Deep Learning - one shot learning for speaker recognition using Filter Banks
This is the library for the Unbounded Interleaved-State Recurrent Neural Network (UIS-RNN) algorithm, corresponding to the paper Fully Supervised Speaker Diarization.
speaker diarization by uis-rnn and speaker embedding by vgg-speaker-recognition
Final project for the Speaker Recognition course on Udemy, 机器之心, 深蓝学院 and 语音之家
speechlib is a library that can do speaker diarization, transcription and speaker recognition on an audio file to create transcripts with actual speaker names
A desktop application that uses AI to translate voice between languages in real time, while preserving the speaker's tone and emotion.
This project uses a variety of advanced voiceprint recognition models such as EcapaTdnn, ResNetSE, ERes2Net, CAM++, etc. It is not excluded that more models will be supported in the future. At the sam...
本项目使用了EcapaTdnn、ResNetSE、ERes2Net、CAM++等多种先进的声纹识别模型,同时本项目也支持了MelSpectrogram、Spectrogram、MFCC、Fbank等多种数据预处理方法
Research and Production Oriented Speaker Verification, Recognition and Diarization Toolkit
This repository contains audio samples and supplementary materials accompanying publications by the "Speaker, Voice and Language" team at Google.
Neural building blocks for speaker diarization: speech activity detection, speaker change detection, overlapped speech detection, speaker embedding
Unofficial reimplementation of ECAPA-TDNN for speaker recognition (EER=0.86 for Vox1_O when train only in Vox2)
Deep Learning - one shot learning for speaker recognition using Filter Banks
Final project for the Speaker Recognition course on Udemy, 机器之心, 深蓝学院 and 语音之家
A curated list of speaker-embedding speaker-verification, speaker-identification resources.
A scalable generative AI framework built for researchers and developers working on Large Language Models, Multimodal, and Speech AI (Automatic Speech Recognition and Text-to-Speech)
SincNet is a neural architecture for efficiently processing raw audio samples.
speechlib is a library that can do speaker diarization, transcription and speaker recognition on an audio file to create transcripts with actual speaker names
A scalable generative AI framework built for researchers and developers working on Large Language Models, Multimodal, and Speech AI (Automatic Speech Recognition and Text-to-Speech)
Neural building blocks for speaker diarization: speech activity detection, speaker change detection, overlapped speech detection, speaker embedding
Research and Production Oriented Speaker Verification, Recognition and Diarization Toolkit
This project uses a variety of advanced voiceprint recognition models such as EcapaTdnn, ResNetSE, ERes2Net, CAM++, etc. It is not excluded that more models will be supported in the future. At the sam...
A desktop application that uses AI to translate voice between languages in real time, while preserving the speaker's tone and emotion.
Aims to create a comprehensive voice toolkit for training, testing, and deploying speaker verification systems.
Unofficial reimplementation of ECAPA-TDNN for speaker recognition (EER=0.86 for Vox1_O when train only in Vox2)
speechlib is a library that can do speaker diarization, transcription and speaker recognition on an audio file to create transcripts with actual speaker names
This repository contains audio samples and supplementary materials accompanying publications by the "Speaker, Voice and Language" team at Google.
本项目使用了EcapaTdnn、ResNetSE、ERes2Net、CAM++等多种先进的声纹识别模型,同时本项目也支持了MelSpectrogram、Spectrogram、MFCC、Fbank等多种数据预处理方法
SincNet is a neural architecture for efficiently processing raw audio samples.
This is the library for the Unbounded Interleaved-State Recurrent Neural Network (UIS-RNN) algorithm, corresponding to the paper Fully Supervised Speaker Diarization.
Deep Learning - one shot learning for speaker recognition using Filter Banks
speaker diarization by uis-rnn and speaker embedding by vgg-speaker-recognition
an open-source implementation of sequence-to-sequence based speech processing engine
speechlib is a library that can do speaker diarization, transcription and speaker recognition on an audio file to create transcripts with actual speaker names
A desktop application that uses AI to translate voice between languages in real time, while preserving the speaker's tone and emotion.
Introduction to Speech Processing
Research and Production Oriented Speaker Verification, Recognition and Diarization Toolkit
Aims to create a comprehensive voice toolkit for training, testing, and deploying speaker verification systems.
a Pytorch library for security research on speaker recognition, released in "Towards Understanding and Mitigating Audio Adversarial Examples for Speaker Recognition" accepted by TDSC
This project uses a variety of advanced voiceprint recognition models such as EcapaTdnn, ResNetSE, ERes2Net, CAM++, etc. It is not excluded that more models will be supported in the future. At the sam...
Neural building blocks for speaker diarization: speech activity detection, speaker change detection, overlapped speech detection, speaker embedding
本项目使用了EcapaTdnn、ResNetSE、ERes2Net、CAM++等多种先进的声纹识别模型,同时本项目也支持了MelSpectrogram、Spectrogram、MFCC、Fbank等多种数据预处理方法
This repository contains audio samples and supplementary materials accompanying publications by the "Speaker, Voice and Language" team at Google.
A scalable generative AI framework built for researchers and developers working on Large Language Models, Multimodal, and Speech AI (Automatic Speech Recognition and Text-to-Speech)
Speaker identification/verification models for Machine Learning for Computer Vision class at UNIBO
Unofficial reimplementation of ECAPA-TDNN for speaker recognition (EER=0.86 for Vox1_O when train only in Vox2)
Deep Learning - one shot learning for speaker recognition using Filter Banks
A curated list of speaker-embedding speaker-verification, speaker-identification resources.