  Pronunciation Group

by Mitchel Weintraub (SRI), Eric Fosler (ICSI), Charles Galles (DoD), Yu-hung Kao (TI), Sanjeev Khudanpur (JHU), Murat Saraclar (JHU), Steven Wegmann (Dragon)
http://www-speech.sri.com/papers/ws96-pronunciation.ps.gz

Abstract:

Today's recognizers are primarily based on single pronunciations for most words. This means that the burden of modeling phonetic variability falls entirely on acoustic modeling. In addition, certain types of pronunciation variation (phone deletion/reduction, dialect) are impossible to model well at the acoustic level. We suspect that one of the difficulties in recognizing conversational speech (compared to read speech) is the greater variability of pronunciation. We propose to capture this variability by modeling the pronunciations for each word. The goal of this project is to automatically learn a model of word pronunciation from data. We focus on frequent words that appear many times in the Switchboard and Callhome corpora, since a small number of words make up a large fraction of the total errors. We can hope to learn these pronunciations automatically since these words occur many times in the training data. All past attempts in this area have treated pronunciation variants as mutually independent, i.e., under the assumption that any speaker would choose one of the given variants with a given probability, independent of related choices in the same phonological context, in the same conversation, or by the same speaker. Such an approach is simple to implement, but increases the number of parameters and the
