by G. Zweig, J. Bilmes, T. Richardson, K. Filali, K. Livescu, P. Xu, K. Jackson, E. Holtz, J. Torres, B. Byrne
Proc. IEEE Intl. Conf. on Acoustics, Speech, and Signal Processing
http://www.sls.csail.mit.edu/sls/publications/2002/3241.pdf
Abstract:
In recent years there has been growing interest in discriminative parameter training techniques, resulting from notable improvements in speech recognition performance on tasks ranging in size from digit recognition to Switchboard. Typified by Maximum Mutual Information training, these methods assume a fixed statistical modeling structure, and then optimize only the associated numerical parameters (such as means, variances, and transition matrices). In this paper, we explore the significantly different methodology of discriminative structure learning. Here, the fundamental dependency relationships between random variables in a probabilistic model are learned in a discriminative fashion, and are learned separately from the numerical parameters. In order to apply the principles of structural discriminability, we adopt the framework of graphical models, which allows an arbitrary set of variables with arbitrary conditional independence relationships to be modeled at each time frame. We present results using a new graphical modeling toolkit (described in a companion paper) from the recent 2001 Johns Hopkins Summer Workshop. These results indicate that significant gains result from discriminative structural analysis of both conventional MFCC and novel AM-FM features on the Aurora continuous digits task.