
## SPEAKER VERIFICATION WITH THE MIXTURE OF GAUSSIAN FACTOR ANALYSIS BASED REPRESENTATION

### Citations

296 | Front-end factor analysis for speaker verification
- Dehak, Kenny, et al.
- 2011
Citation Context ...tor modeling has gained significant attention in both speaker verification (SV) and language identification (LID) domains due to its excellent performance, compact representation and small model size [1, 2, 3]. In this modeling, first, zero-order and first-order Baum-Welch statistics are calculated by projecting the MFCC features on those Gaussian Mixture Model (GMM) components using the occupancy posterio...
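The zero- and first-order Baum-Welch statistics mentioned in this context can be sketched as follows. This is a minimal NumPy illustration against a diagonal-covariance GMM; the function and variable names are illustrative, not from the cited paper:

```python
import numpy as np

def baum_welch_stats(X, weights, means, covs):
    """Zero- and first-order Baum-Welch statistics of an utterance X
    (L frames x D dims) under a C-component diagonal-covariance GMM."""
    C, D = means.shape
    # log N(x_t | mu_c, diag(sigma_c^2)) for every frame and component
    log_det = np.sum(np.log(covs), axis=1)                 # (C,)
    diff = X[:, None, :] - means[None, :, :]               # (L, C, D)
    maha = np.sum(diff ** 2 / covs[None, :, :], axis=2)    # (L, C)
    log_p = -0.5 * (D * np.log(2 * np.pi) + log_det[None, :] + maha)
    log_p += np.log(weights)[None, :]
    # occupancy posteriors gamma_{t,c}: stable softmax over components
    log_p -= log_p.max(axis=1, keepdims=True)
    gamma = np.exp(log_p)
    gamma /= gamma.sum(axis=1, keepdims=True)
    N = gamma.sum(axis=0)      # zero-order statistics, shape (C,)
    F = gamma.T @ X            # first-order statistics, shape (C, D)
    return N, F
```

The zero-order statistics sum to the number of frames, since each frame's posteriors sum to one.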

182 | Support vector machines using GMM supervectors for speaker verification
- Campbell, Sturim, et al.
- 2006
Citation Context ...d, within this i-vector space, variability compensation methods, such as Within-Class Covariance Normalization (WCCN) [4], Linear Discriminant Analysis (LDA) and Nuisance Attribute Projection (NAP) [5], are performed to reduce the variability for the subsequent modeling methods (e.g., Support Vector Machine [6], Sparse Representation [7], Probabilistic Linear Discriminant Analysis (PLDA) [8, 9, 10]...

157 | SVM based speaker verification using a GMM supervector kernel and NAP variability compensation
- Campbell, Sturim, et al.
Citation Context ...(WCCN) [4], Linear Discriminant Analysis (LDA) and Nuisance Attribute Projection (NAP) [5], are performed to reduce the variability for the subsequent modeling methods (e.g., Support Vector Machine [6], Sparse Representation [7], Probabilistic Linear Discriminant Analysis (PLDA) [8, 9, 10], etc.). Conventionally, in the i-vector framework, the tokens for calculating the zero-order and first-order Baum-Welch statistics are the MFCC-feature-trained GMM components. Such choice of token u...

116 | Probabilistic linear discriminant analysis for inferences about identity
- Prince
- 2007
Citation Context ... (NAP) [5], are performed to reduce the variability for the subsequent modeling methods (e.g., Support Vector Machine [6], Sparse Representation [7], Probabilistic Linear Discriminant Analysis (PLDA) [8, 9, 10], etc.). Conventionally, in the i-vector framework, the tokens for calculating the zero-order and first-order Baum-Welch statistics are the MFCC-feature-trained GMM components. Such choice of token u...

114 | Eigenvoice modeling with sparse training data
- Kenny, Boulianne, et al.
- 2005
Citation Context ...ctor [2]. Considering a C-component GMM and D-dimensional acoustic features, the total variability matrix T is a CD × K matrix which can be estimated the same way as learning the eigenvoice matrix V in [18] except that here we consider that every utterance is produced by a new speaker [2]. Given the centered mean supervector F̃ and total variability matrix T, the i-vector is computed as follows [2]: x =...
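The i-vector formula truncated at the end of this context is, in the standard total variability model of [2], the posterior mean x = (I + TᵀΣ⁻¹NT)⁻¹TᵀΣ⁻¹F̃ given the Baum-Welch statistics. A hedged NumPy sketch, assuming a diagonal UBM covariance Σ (names are illustrative):

```python
import numpy as np

def extract_ivector(N, F_tilde, T, Sigma):
    """i-vector as the posterior mean of the latent factor x.
    N: (C,) zero-order stats; F_tilde: (C*D,) centered first-order
    supervector; T: (C*D, K) total variability matrix;
    Sigma: (C*D,) diagonal UBM covariance entries."""
    C = N.shape[0]
    D = F_tilde.shape[0] // C
    K = T.shape[1]
    N_big = np.repeat(N, D)                 # expand N_c to each of its D dims
    TtSinv = T.T / Sigma[None, :]           # T' Sigma^{-1}, shape (K, C*D)
    L = np.eye(K) + TtSinv @ (N_big[:, None] * T)   # posterior precision
    x = np.linalg.solve(L, TtSinv @ F_tilde)        # posterior mean
    return x
```

Solving the K×K system rather than inverting it explicitly is the usual numerically preferable choice.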

109 | Joint factor analysis versus eigenchannels in speaker recognition
- Kenny, Boulianne, et al.
- 2007
Citation Context ...aseline In the total variability space, there is no distinction between the speaker effects and the channel effects. Rather than separately using the eigenvoice matrix V and the eigenchannel matrix U [17], the total variability space simultaneously captures the speaker and channel variabilities [2]. Given a C-component GMM UBM model λ with λc = {pc, µc, Σc}, c = 1, · · · , C and an utterance with a L f...

79 | Within-class covariance normalization for SVM-based speaker recognition
- Hatch, Kajarekar, et al.
Citation Context ... jointly models language, speaker and channel variabilities all together [1]. Third, within this i-vector space, variability compensation methods, such as Within-Class Covariance Normalization (WCCN) [4], Linear Discriminant Analysis (LDA) and Nuisance Attribute Projection (NAP) [5], are performed to reduce the variability for the subsequent modeling methods (e.g., Support Vector Machine [6], Spars...
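WCCN as referenced here amounts to estimating the average within-class covariance W of the i-vectors and projecting with a matrix B satisfying BᵀWB = I. A minimal illustrative sketch (not the authors' code; names are my own):

```python
import numpy as np

def wccn_projection(vectors, labels):
    """Within-Class Covariance Normalization: return B such that the
    projected within-class covariance B' W B is the identity."""
    classes = np.unique(labels)
    D = vectors.shape[1]
    W = np.zeros((D, D))
    for c in classes:
        Xc = vectors[labels == c]
        Xc = Xc - Xc.mean(axis=0)          # center within the class
        W += Xc.T @ Xc / len(Xc)           # class covariance
    W /= len(classes)                      # average over classes
    # Cholesky factor of W^{-1}: B @ B.T = W^{-1}, hence B.T @ W @ B = I
    B = np.linalg.cholesky(np.linalg.inv(W))
    return B
```

Applying `vectors @ B` then whitens the within-class scatter before the back-end classifier.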

72 | Hierarchical structures of neural networks for phoneme recognition
- Schwarz, Matejka, et al.
Citation Context ...rted into a sequence of 36-dimensional feature vectors, each consisting of 18 MFCC coefficients and their first derivatives. For phonetic feature extraction, we employed an English phoneme recognizer [21] to perform the voice activity detection (VAD) and output the frame-level monophone state posterior probabilities. After log, PCA and MVN, the resulting 52-dimensional tandem features are fused with MFC...
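The log → PCA → MVN → fusion pipeline described in this context can be sketched as below. This is a hypothetical helper, assuming the recognizer's frame-level posteriors are already available as a matrix:

```python
import numpy as np

def tandem_features(mfcc, phone_post, pca_dim=52):
    """Fuse MFCC with phonetic tandem features:
    log of phoneme-state posteriors, PCA to pca_dim,
    mean/variance normalization (MVN), then concatenation."""
    logp = np.log(phone_post + 1e-10)      # log posteriors (avoid log 0)
    logp = logp - logp.mean(axis=0)        # center before PCA
    # PCA via SVD of the centered matrix; keep the top pca_dim directions
    _, _, Vt = np.linalg.svd(logp, full_matrices=False)
    proj = logp @ Vt[:pca_dim].T
    # per-dimension mean/variance normalization
    proj = (proj - proj.mean(axis=0)) / (proj.std(axis=0) + 1e-10)
    return np.hstack([mfcc, proj])         # fused tandem features
```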

35 | Language Recognition via i-vectors and Dimensionality Reduction
- Dehak, Torres-Carrasquillo, et al.
- 2011
Citation Context ...tor modeling has gained significant attention in both speaker verification (SV) and language identification (LID) domains due to its excellent performance, compact representation and small model size [1, 2, 3]. In this modeling, first, zero-order and first-order Baum-Welch statistics are calculated by projecting the MFCC features on those Gaussian Mixture Model (GMM) components using the occupancy posterio...

24 | Full-Covariance UBM and Heavy-Tailed PLDA in I-Vector Speaker Verification
- Matejka, Glembek, et al.
- 2011
Citation Context ... (NAP) [5], are performed to reduce the variability for the subsequent modeling methods (e.g., Support Vector Machine [6], Sparse Representation [7], Probabilistic Linear Discriminant Analysis (PLDA) [8, 9, 10], etc.). Conventionally, in the i-vector framework, the tokens for calculating the zero-order and first-order Baum-Welch statistics are the MFCC-feature-trained GMM components. Such choice of token u...

12 | A novel scheme for speaker recognition using a phonetically-aware deep neural network (ICASSP 2014)
- Lei, Scheffer, et al.
- 2014
Citation Context ...zero-order and first-order Baum-Welch statistics are the MFCC-feature-trained GMM components. Such choice of token units may not be the optimal solution. Recently, the generalized i-vector framework [11, 12, 13, 14, 15] has been proposed. In this framework, the tokens for calculating the zero-order statistics have...

10 | Simplified supervised i-vector modeling with application to robust and efficient language identification and speaker verification, Computer Speech and Language
- Li, Narayanan
- 2014
Citation Context ...nce’s i-vector xj. The residual that cannot be represented by T is described as a single Gaussian variable εij: F̃ij = Ti xj + εij (9). The corresponding generative model is defined the same way as in [19], where Nij denotes the corresponding zero-order statistics: P(xj) = N(0, I), P(F̃ij | xj) = N(Ti xj, σi²/Nij) (10). In the Mixture of Gaussian factor analysis, we apply a mixture of Gaussians wit...

9 | Shifted-delta MLP features for spoken language recognition
- Wang, Leung, et al.
Citation Context ...zero-order and first-order Baum-Welch statistics are the MFCC-feature-trained GMM components. Such choice of token units may not be the optimal solution. Recently, the generalized i-vector framework [11, 12, 13, 14, 15] has been proposed. In this framework, the tokens for calculating the zero-order statistics have...

8 | Speaker verification using sparse representations on total variability i-vectors
- Li, Zhang, et al.
- 2011
Citation Context ...native Analysis (LDA) and Nuisance Attribute Projection (NAP) [5], are performed to reduce the variability for the subsequent modeling methods (e.g., Support Vector Machine [6], Sparse Representation [7], Probabilistic Linear Discriminant Analysis (PLDA) [8, 9, 10], etc.). Conventionally, in the i-vector framework, the tokens for calculating the zero-order and first-order Baum-Welch statistics are th...

7 | Analysis of i-vector length normalization in speaker recognition systems
- Garcia-Romero, Espy-Wilson
- 2011
Citation Context ... (NAP) [5], are performed to reduce the variability for the subsequent modeling methods (e.g., Support Vector Machine [6], Sparse Representation [7], Probabilistic Linear Discriminant Analysis (PLDA) [8, 9, 10], etc.). Conventionally, in the i-vector framework, the tokens for calculating the zero-order and first-order Baum-Welch statistics are the MFCC-feature-trained GMM components. Such choice of token u...

7 | Text-dependent speaker verification: Classifiers, databases and RSR2015
- Larcher, Lee, et al.
- 2014
Citation Context ... (flattened results-table residue omitted) [22]. RSR2015 is indeed a short-duration database which is suitable for testing performance in the short-duration scenario. In the RSR2015 database, the number of speakers in the background, development ...

6 | Extended phone log-likelihood ratio features and acoustic-based i-vectors for language recognition
- D’Haro, Cordoba, et al.

4 | Deep neural networks for extracting baum-welch statistics for speaker recognition
- Kenny, Gupta, et al.
- 2014
Citation Context ...zero-order and first-order Baum-Welch statistics are the MFCC-feature-trained GMM components. Such choice of token units may not be the optimal solution. Recently, the generalized i-vector framework [11, 12, 13, 14, 15] has been proposed. In this framework, the tokens for calculating the zero-order statistics have...

4 | Speaker verification and spoken language identification using a generalized i-vector framework with phonetic tokenization and tandem features
- Li, Liu
- 2014

1 | Robust principal component analysis with complex noise
- Zhao, Meng, et al.
- 2014
Citation Context ...d. Therefore, we propose a generalized factor analysis framework by replacing the single Gaussian with a mixture of Gaussians to better model the residual noises. This idea was originally proposed in [16] to fit the complex residual energy in the robust principal component analysis (PCA) framework. In this work, we extend the MoG residual noise fitting method [16] from PCA to factor analysis. The MoG ...
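The MoG residual fitting borrowed from [16] amounts to running EM over a mixture of Gaussians on the residuals. A small illustrative sketch, assuming zero-mean components on scalar residuals for simplicity (this is not the paper's implementation):

```python
import numpy as np

def fit_mog_residuals(r, K=2, iters=50, seed=0):
    """EM for a K-component mixture of zero-mean Gaussians on the
    residual vector r: returns mixture weights pi and variances var."""
    rng = np.random.default_rng(seed)
    pi = np.ones(K) / K
    var = rng.uniform(0.5, 2.0, size=K) * r.var()   # random variance init
    for _ in range(iters):
        # E-step: responsibilities of each component for each residual
        logp = -0.5 * (np.log(2 * np.pi * var)[None, :]
                       + r[:, None] ** 2 / var[None, :])
        logp += np.log(pi)[None, :]
        logp -= logp.max(axis=1, keepdims=True)     # numerical stability
        g = np.exp(logp)
        g /= g.sum(axis=1, keepdims=True)
        # M-step: update weights and (zero-mean) variances
        Nk = g.sum(axis=0)
        pi = Nk / len(r)
        var = (g * r[:, None] ** 2).sum(axis=0) / Nk
    return pi, var
```

With K = 1 this degenerates to the single-Gaussian residual model of the baseline, which matches the comparison made in the paper's tables.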