  Speech Recognition with Dynamic Bayesian Networks (1998) [81 citations — 9 self]

by Geoffrey Zweig, Stuart Russell
http://www.cs.berkeley.edu/~russell/papers/aaai98-speech.ps

Abstract:

Dynamic Bayesian networks (DBNs) are a useful tool for representing complex stochastic processes. Recent developments in inference and learning in DBNs allow their use in real-world applications. In this paper, we apply DBNs to the problem of speech recognition. The factored state representation enabled by DBNs allows us to explicitly represent long-term articulatory and acoustic context in addition to the phonetic-state information maintained by hidden Markov models (HMMs). Furthermore, it enables us to model the short-term correlations among multiple observation streams within single time-frames. Given a DBN structure capable of representing these long- and short-term correlations, we applied the EM algorithm to learn models with up to 500,000 parameters. The use of structured DBN models decreased the error rate by 12 to 29% on a large-vocabulary isolated-word recognition task, compared to a discrete HMM; it also improved significantly on other published results for the same task. This is the first successful application of DBNs to a large-scale speech recognition problem. Investigation of the learned models indicates that the hidden state variables are strongly correlated with acoustic properties of the speech signal.
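
To make the "factored state representation" concrete, here is a minimal Python sketch of a toy DBN with two hidden chains: a phonetic state `q` (as in an HMM) and a separate context variable `c`, both of which condition a discrete observation. Everything in it is an assumption made for illustration and is not taken from the paper: the network structure, the variable names, the probability tables, and the brute-force forward pass over the joint state, which stands in for whatever inference procedure the authors actually used.

```python
from itertools import product

# Hypothetical toy parameters (not from the paper).
PRIOR_Q = [1.0, 0.0]                      # P(Q_1): always start in phone state 0
TRANS_Q = [[0.7, 0.3], [0.0, 1.0]]        # P(Q_t | Q_{t-1}), left-to-right
PRIOR_C = [0.5, 0.5]                      # P(C_1): context variable
TRANS_C = [[0.9, 0.1], [0.1, 0.9]]        # P(C_t | C_{t-1}), slowly changing
# EMIT[q][c][o] = P(O_t = o | Q_t = q, C_t = c) over two discrete symbols
EMIT = [[[0.9, 0.1], [0.6, 0.4]],
        [[0.2, 0.8], [0.5, 0.5]]]


def forward_likelihood(obs, prior_q, prior_c, trans_q, trans_c, emit):
    """Return P(obs) by running the forward recursion over joint states (q, c)."""
    states = list(product(range(len(prior_q)), range(len(prior_c))))
    # alpha[(q, c)] = P(o_1..o_t, Q_t = q, C_t = c)
    alpha = {(q, c): prior_q[q] * prior_c[c] * emit[q][c][obs[0]]
             for q, c in states}
    for o in obs[1:]:
        new_alpha = {}
        for q, c in states:
            # The two chains evolve independently, so the joint transition
            # probability is the product of the two small tables.
            reach = sum(alpha[(qp, cp)] * trans_q[qp][q] * trans_c[cp][c]
                        for qp, cp in states)
            new_alpha[(q, c)] = reach * emit[q][c][o]
        alpha = new_alpha
    return sum(alpha.values())


if __name__ == "__main__":
    print(forward_likelihood([0, 0, 1], PRIOR_Q, PRIOR_C, TRANS_Q, TRANS_C, EMIT))
```

The point of the factoring is parameter economy: with `n_q` phone states and `n_c` context values, the two transition tables hold `n_q**2 + n_c**2` entries, whereas a flat HMM over the joint state would need `(n_q * n_c)**2`. This sketch still enumerates the joint state during inference, so it illustrates the representation only, not an efficient algorithm.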

Citations

880 Fundamentals of Speech Recognition – Rabiner, Juang - 1993
342 Comparison of parametric representations for monosyllabic word recognition in continuously spoken sentences – Davis, Mermelstein - 1980
317 Connectionist Speech Recognition: A Hybrid Approach – Bourlard, Morgan - 1994
254 Factorial hidden Markov models – Ghahramani, Jordan - 1997
157 Automatic Speech Recognition: The Development of the SPHINX System – Lee - 1989
155 The EM algorithm for graphical association models with missing data – Lauritzen - 1995
146 Probabilistic independence networks for hidden markov probability models – Smyth, Heckerman, et al.
69 Local learning in probabilistic networks with hidden variables – Russell, Binder, et al. - 1995
41 PhoneBook: A phonetically-rich isolated-word telephone-speech database – Pitrelli, Fong, et al. - 1995
23 Hybrid HMM/ANN systems for training independent tasks: Experiments on phonebook and related improvements – Dupont, Bourlard, et al. - 1997
22 A model for reasoning about persistence and causation – Dean, Kanazawa - 1989 (Computational Intelligence 5:142–150)
12 A recurrent error propagation speech recognition system – Robinson, Fallside - 1991
5 Compositional modeling with DPNs – Zweig, Russell - 1997
4 Fusion and Propagation with Multiple Observations – Peot, Shachter - 1991