Results 1 - 7 of 7
Bayesian adaptive inference and adaptive training
IEEE Transactions Speech and Audio Processing, 2007
Cited by 9 (7 self)
Abstract: Large-vocabulary speech recognition systems are often built using found data, such as broadcast news. In contrast to carefully collected data, found data normally contains multiple acoustic conditions, such as speaker or environmental noise. Adaptive training is a powerful approach to build systems on such data. Here, transforms are used to represent the different acoustic conditions, and then a canonical model is trained given this set of transforms. This paper describes a Bayesian framework for adaptive training and inference. This framework addresses some limitations of standard maximum-likelihood approaches. In contrast to the standard approach, the adaptively trained system can be directly used in unsupervised inference, rather than having to rely on initial hypotheses being present. In addition, for limited adaptation data, robust recognition performance can be obtained. The limited data problem often occurs in testing as there is no control over the amount of the adaptation data available. In contrast, for adaptive training, it is possible to control the system complexity to reflect the available data. Thus, the standard point estimates may be used. As the integral associated with Bayesian adaptive inference is intractable, various marginalization approximations are described, including a variational Bayes approximation. Both batch and incremental modes of adaptive inference are discussed. These approaches are applied to adaptive training of maximum-likelihood linear regression and evaluated on a large-vocabulary speech recognition task. Bayesian adaptive inference is shown to significantly outperform standard approaches.
Index Terms: Adaptive training, Bayesian adaptation, Bayesian inference, incremental, variational Bayes.
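The contrast this abstract draws between point estimates and Bayesian adaptive inference can be written out roughly as follows. This is only a sketch in assumed notation, not taken from the paper: O stands for the observed test data, H for a word hypothesis, T for the transform of the test condition, p(T) for a prior over transforms, and q(T) for an arbitrary variational distribution.

```latex
\begin{align}
\text{point estimate:}\quad
  & p(O \mid H) \approx p(O \mid H, \hat{T}),
  \qquad \hat{T} = \arg\max_{T}\, p(O \mid H, T) \\
\text{Bayesian inference:}\quad
  & p(O \mid H) = \int p(O \mid H, T)\, p(T)\, dT \\
\text{variational lower bound:}\quad
  & \log p(O \mid H) \ge \int q(T)\,
    \log \frac{p(O \mid H, T)\, p(T)}{q(T)}\, dT
\end{align}
```

The last line follows from Jensen's inequality and holds for any q(T). It replaces the intractable integral with a bound that can be tightened by optimising q.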
What is the Best Type of Prior Distribution for EMAP Speaker Adaptation?
"... Centre de recherche informatique de Montréal ..."
Interspeaker correlations, intraspeaker correlations and Bayesian adaptation
ISCA Archive
"... Centre de recherche informatique de Montréal ..."
Bayesian Adaptive Inference and Adaptive Training
Abstract: Large vocabulary speech recognition systems are often built using found data, such as broadcast news. In contrast to carefully collected data, found data normally contains multiple acoustic conditions, such as speaker or environmental noise. Adaptive training is a powerful approach to build systems on such data. Here, transforms are used to represent the different acoustic conditions, and then a canonical model is trained given this set of transforms. This paper describes a Bayesian framework for adaptive training and inference. This framework addresses some limitations of standard ML approaches. In contrast to the standard approach, the adaptively trained system can be directly used in unsupervised inference, rather than having to rely on initial hypotheses being present. In addition, for limited adaptation data, robust recognition performance can be obtained. The limited data problem often occurs in testing as there is no control over the amount of the adaptation data available. In contrast, for adaptive training, it is possible to control the system complexity to reflect the available data. Thus, the standard point estimates may be used. As the integral associated with Bayesian adaptive inference is intractable, various marginalisation approximations are described, including a variational Bayes approximation. Both batch and incremental modes of adaptive inference are discussed. These approaches are applied to adaptive training of maximum likelihood linear regression and evaluated on a large vocabulary speech recognition task. Bayesian adaptive inference is shown to significantly outperform standard approaches.
BAYESIAN ADAPTATION AND ADAPTIVELY TRAINED SYSTEMS
As the use of found data increases, more systems are being built using adaptive training. Here, transforms are used to represent unwanted acoustic variability, e.g. speaker and acoustic environment changes, allowing a canonical model that models only the “pure” variability of speech to be trained. Adaptive training may be described within a Bayesian framework. By using complexity control approaches to ensure robust parameter estimates, the standard point estimate adaptive training can be justified within this Bayesian framework. However, during recognition there is usually no control over the amount of data available. It is therefore preferable to be able to use a full Bayesian approach to applying transforms during recognition rather than the standard point estimates. This paper discusses various approximations to Bayesian approaches, including a new variational Bayes approximation. The application of these approaches to state-of-the-art adaptively trained systems using both CAT and MLLR transforms is then described and evaluated on a large vocabulary speech recognition task.
SPEAKER ADAPTATION USING INTERSPEAKER AND INTRASPEAKER CORRELATIONS
The problem of speaker adaptation as it is usually formulated consists in using a minimal amount of adaptation data to estimate speaker-dependent centroids or mean vectors or, equivalently, speaker-offsets (where a speaker-offset is the difference between a speaker-dependent mean vector and the corresponding speaker-independent mean vector). Owing to data insufficiency, maximum likelihood estimation is inadequate for this purpose, and classical Bayesian estimation suffers from the drawback that it only provides estimates for centroids which are observed in the adaptation data (typically a very small fraction of the total). Accordingly, many authors have investigated the use of correlations between different offsets as prior information for the purpose of MAP estimation of speaker-offsets. The standard assumption here [1, 6, 7, 8] is that the correlations between speaker-offsets for different mixture
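The role of correlations in MAP estimation of speaker-offsets can be illustrated with a toy sketch. It is not the method of any paper listed here. It assumes scalar offsets, a zero-mean Gaussian prior with a known covariance matrix, a known per-frame noise variance, and adaptation data for a single centroid only. The function name and all parameters are hypothetical.

```python
def map_offsets_single_obs(prior_cov, k, sample_mean, count, noise_var):
    """MAP estimate of every offset when only centroid k has adaptation data.

    Assumes scalar offsets with a zero-mean Gaussian prior of covariance
    prior_cov (a list of rows) and Gaussian observation noise of variance
    noise_var per frame. All names here are hypothetical.
    """
    # Variance of the observed sample mean shrinks with the frame count.
    obs_var = noise_var / count
    # Prior variance of offset k plus the variance of its sample mean.
    denom = prior_cov[k][k] + obs_var
    # Each offset j moves in proportion to its prior covariance with offset k.
    return [prior_cov[j][k] / denom * sample_mean for j in range(len(prior_cov))]


# Two centroids; only centroid 0 is seen, in 4 frames of adaptation data.
correlated = map_offsets_single_obs([[1.0, 0.8], [0.8, 1.0]], 0, 2.0, 4, 1.0)
independent = map_offsets_single_obs([[1.0, 0.0], [0.0, 1.0]], 0, 2.0, 4, 1.0)
print(correlated)
print(independent)
```

With a diagonal prior the unobserved centroid stays at its speaker-independent value (offset zero), which is the drawback of classical Bayesian estimation noted above. With a non-zero prior correlation the unobserved centroid is updated as well.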
Speaker Adaptation Using an Eigenphone Basis
IEEE Trans. Speech Audio Processing
Abstract: We describe a new method of estimating speaker-dependent HMMs for speakers in a closed population. Our method differs from previous approaches in that it is based on an explicit model of the correlations between all of the speakers in the population, the idea being that if there is not enough data to estimate a Gaussian mean vector for a given speaker, then data from other speakers can be used, provided that we know how the speakers are correlated with each other. We explain how to estimate interspeaker correlations using a Kullback-Leibler divergence minimization technique which can be applied to the problem of estimating the parameters of all of the hyperdistributions that are currently used in Bayesian speaker adaptation. EDICS Category: 1-RECO