Attentive Statistics Pooling for Deep Speaker Embedding :: 끄적끄적

Attentive Statistics Pooling for Deep Speaker Embedding

AI Models 2021. 1. 11. 16:39

Overview

A speaker recognition model should produce embeddings with:
- Small intra-speaker distance
- Large inter-speaker distance
Evaluates the most popular loss functions for speaker recognition on the VoxCeleb dataset
Proposes a new metric learning objective function
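
To make the first point of the overview concrete, here is a minimal sketch that compares an intra-speaker distance with an inter-speaker distance. The 2-dimensional embeddings are made-up toy values, and cosine distance is an assumption chosen for illustration; the notes above do not say which distance is used.

```python
import numpy as np


def cosine_distance(a, b):
    """Return 1 - cosine similarity between two 1-D vectors."""
    return 1.0 - float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))


# Hypothetical toy embeddings: two utterances per speaker.
spk_a = np.array([[1.0, 0.1], [0.9, 0.2]])
spk_b = np.array([[0.1, 1.0], [0.2, 0.9]])

# Distance between two utterances of the same speaker.
intra = cosine_distance(spk_a[0], spk_a[1])
# Distance between utterances of different speakers.
inter = cosine_distance(spk_a[0], spk_b[0])

print(f"intra-speaker: {intra:.3f}, inter-speaker: {inter:.3f}")
```

With a well-behaved embedding, the intra-speaker value should come out clearly smaller than the inter-speaker value, as it does for these toy vectors.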

Higher-order pooling with attention

Statistics pooling
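- Calculate the mean vector over the frame-level features h_t (t = 1, ..., T): μ = (1/T) Σ_t h_t [1]
- Calculate the standard deviation vector over the same frame-level features: σ = sqrt((1/T) Σ_t h_t ⊙ h_t − μ ⊙ μ), where ⊙ is the element-wise product [2]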
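
The two statistics pooling steps above can be sketched in NumPy as follows. This is a minimal sketch under stated assumptions: the function name `statistics_pooling`, the `eps` clamp, and concatenating the mean and standard deviation into one utterance-level vector are illustrative choices, not something these notes spell out.

```python
import numpy as np


def statistics_pooling(h, eps=1e-12):
    """Pool frame-level features h of shape (T, D) into a (2 * D,) vector.

    The output is the per-dimension mean followed by the per-dimension
    standard deviation, both computed over the T frames.
    """
    mu = h.mean(axis=0)
    # Variance as E[h * h] - mu * mu (element-wise), clamped at eps
    # (an assumed safeguard) so rounding error cannot make it negative.
    var = (h * h).mean(axis=0) - mu * mu
    sigma = np.sqrt(np.clip(var, eps, None))
    # Assumption: mean and std are concatenated into a single vector.
    return np.concatenate([mu, sigma])


# Toy input: T = 3 frames, D = 2 feature dimensions.
h = np.array([[1.0, 2.0], [3.0, 2.0], [5.0, 8.0]])
pooled = statistics_pooling(h)
print(pooled.shape)
```

Both statistics weight every frame equally, so the pooled vector has a fixed size regardless of the number of frames T.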
