18 citations found.
D. McAllaster, L. Gillick, F. Scattone, and M. Newman. Fabricating conversational speech data with acoustic models: a program to examine model-data mismatch. In Proceedings of the International Conference on Spoken Language Processing (ICSLP), 1998.

This paper is cited in the following contexts:
Pronunciation Modeling Using a Finite-State.. - Hazen.. (2002)   (7 citations)  (Correct)

....within our system reduces word error rates by between 4% and 8% over different test sets when compared against a system using no phonological rewrite rules. 1. INTRODUCTION Pronunciation variation has been identified as a major cause of errors for a variety of automatic speech recognition tasks [8]. In particular, pronunciation variation can be quite severe in spontaneous, conversational speech. To address this problem, this paper presents a pronunciation modeling approach that has been under development at MIT for more than a decade. Our approach systematically models pronunciation ....

D. McAllaster, L. Gillick, F. Scattone, and M. Newman, "Fabricating conversational speech data with acoustic models: A program to examine model-data mismatch," in Proc. ICSLP, Sydney, Australia, December 1998.


Theory and Practice of Acoustic Confusability - Printz, Olsen (2000)   (2 citations)  (Correct)

....to generate data points. Though it has exactly the same form as an evaluation model, we will refer to such a model when used in this way as a synthesizer model. We are not the first to propose the use of synthetic data in speech recognition, and we note especially the contributions of [12]. 4.2 Computing Acoustic Confusability We now present an algorithm for computing acoustic confusability. Since it relies upon the hidden Markov model formalism, it necessarily involves summation over sequences of states. Our method comprises an exact computation over all state sequences of all ....

Don McAllaster, Larry Gillick, Francesco Scattone, and Mike Newman. Fabricating conversational speech data with acoustic models: A program to examine model--data mismatch. In Proceedings of ICSLP, Sydney, Australia, November 1998. Paper 986.


What Kind Of Pronunciation Variation Is Hard For.. - Jurafsky, Ward.. (2001)   (5 citations)  (Correct)

....Our analysis suggests new areas where future pronunciation models should focus, including syllable deletion. 1. INTRODUCTION Many studies of human to human speech have shown that pronunciation variation is a key factor contributing to the high error rates of current recognizers. For example [1] showed that Switchboard word error decreased from 40% to 8% if the dictionary pronunciation matched the actual pronunciation. While the need for better pronunciation modeling is widely acknowledged, and many previous researchers have attempted to build models of the lexicon which capture this ....

....network [2] as shown in Figure 1. [Fig. 1. A multiple pronunciation network for "about".] But our own research, and that of others, has shown that these allophone networks do not perform well. Both [3] and [1] showed that blindly adding multiple pronunciations to a dictionary, even those shown to improve the performance of a single utterance, substantially increased the word error of a Switchboard recognizer. [Footnote: Thanks to the National Science Foundation for partial support of this work via IIS 9978025.] ....

Don McAllaster, Larry Gillick, Francesco Scattone, and Mike Newman, "Fabricating conversational speech data with acoustic models: A program to examine model-data mismatch," in ICSLP-98, Sydney, 1998, vol. 5, pp. 1847--1850.


Moving Beyond the `Beads-On-A-String' Model of Speech - Ostendorf (1999)   (6 citations)  (Correct)

....near doubling of word error rates on the exact same word sequence when it was spoken spontaneously vs. read [2]. More recently, McAllaster et al. use simulated data in experiments that suggest that poor pronunciation modeling accounts for the bulk of the high error rate on the Switchboard task [3]. Not surprisingly, there have been a large number of research efforts devoted to pronunciation modeling in the last few years, including techniques that use automatic learning, hand written phonological rules and various combinations of the two. Unfortunately, the gains from phone based ....

.... the gains from phone based pronunciation modeling techniques have been disappointing, e.g. reducing word error rates from 40.9% to 38.5% on conversational speech [4]. This gain represents a statistically significant improvement on a difficult task, but not the factor of five reduction predicted in [3]. Of course, the factor of five is optimistic because of the match between modeling assumptions in the recognition and simulation of data, but most researchers still share the intuition that there is more to be gained from pronunciation modeling. Many of the pronunciation models that have been ....

D. McAllaster, L. Gillick, F. Scattone & M. Newman, "Fabricating conversational speech data with acoustic models: A program to examine model-data mismatch," Proc. Int. Conf. on Spoken Language Proc., pp. 1847-1850, 1998.


Use Of Higher Level Linguistic Structure In Acoustic.. - Shafran, Ostendorf (2000)   (1 citation)  (Correct)

....the baseline condition of planned, studio recordings (8-9%) [2]. Those sites that also participated in a workshop on conversational speech recognition a few months later reported word error rates of roughly 40%. While many studies have pointed to pronunciation variability as a key problem, e.g. [3, 4], the work on pronunciation modeling in terms of phone level substitutions, deletions and insertions has so far only yielded small performance gains. While such models are surely needed, we conjecture that there may also be a need to represent variation of a more gradient nature. For example, a ....

D. McAllaster, L. Gillick, F. Scattone & M. Newman, "Fabricating conversational speech data with acoustic models: A program to examine model-data mismatch," Proc. ICSLP, pp. 1847-1850, 1998.


Pronunciation modeling by sharing Gaussian densities.. - Saraçlar, Nock.. (2000)   (19 citations)  (Correct)

....used in these experiments did not have probabilities assigned to each entry. The actual pronunciation models we use do assign probabilities to each alternate pronunciation. Thus, in fact, it is possible to obtain even lower WERs than these lower bounds. Similar experiments were conducted by McAllaster, Gillick, Scattone and Newman (1998) using simulated data. Their results also support the hypothesis that knowing the correct pronunciations can result in large gains. 4. Related work Elaborate pronunciation modeling for conversational speech recognition has received attention only in the last few years. Much of the work prior to ....

McAllaster, D., Gillick, L., Scattone, F. & Newman, M. (1998). Fabricating conversational speech data with acoustic models: A program to examine model-data mismatch. Proceedings of the International Conference on Spoken Language Processing (ICSLP), Sydney, Australia, pp. 1847--1850.


Contextual Word and Syllable Pronunciation Models - Fosler-Lussier (1999)   (2 citations)  (Correct)

....and variation due to channel effects in telephone bandwidth speech. 1. INTRODUCTION Recent studies have shown that appropriate pronunciation models for speech recognition systems are critical for good performance in large vocabulary tasks, particularly when the speaking style is spontaneous [5]. One popular approach to pronunciation modeling is to use decision trees to automatically learn patterns of variation within automatically or hand generated transcriptions. Most decision tree based systems model pronunciations on a phone by phone basis. Each baseform phone is associated with a ....

D. McAllaster, L. Gillick, F. Scattone, and M. Newman. Fabricating conversational speech data with acoustic models: A program to examine model-data mismatch. In ICSLP-98, pages 1847--1850, Sydney, 1998.


Incorporating Contextual Phonetics Into Automatic.. - Fosler-Lussier.. (1999)   (1 citation)  (Correct)

....in which phonetic transcriptions of the test speech data do not match the pronunciations found in the recognition dictionary. For example, one system tested on the Switchboard corpus of spontaneous speech produced one third more errors for words pronounced non canonically. McAllaster et al. [10] used simulated acoustic data with their Switchboard recognizer to normalize the effects of misclassifications made by the acoustic (phonetic categorization) model; focusing on the differences between the phonetic transcript of the Switchboard test set and pronunciation models in the dictionary, ....

McAllaster, D., Gillick, L., Scattone, F., and Newman, M. 1998. Fabricating conversational speech data with acoustic models: A program to examine model-data mismatch. In ICSLP-98, pp. 1847--1850, Sydney, Australia.


Recognizing Sloppy Speech - Yu (2004)   (Correct)

No context found.

D. McAllaster, L. Gillick, F. Scattone, and M. Newman. Fabricating conversational speech data with acoustic models: a program to examine model-data mismatch. In Proceedings of the International Conference on Spoken Language Processing (ICSLP), 1998.


Recognizing Sloppy Speech - Hua Yu Cmu-Lti-   (Correct)

No context found.

D. McAllaster, L. Gillick, F. Scattone, and M. Newman. Fabricating conversational speech data with acoustic models: a program to examine model-data mismatch. In Proceedings of the International Conference on Spoken Language Processing (ICSLP), 1998.


Feature-based Pronunciation Modeling for Speech Recognition - Karen Livescu And (2004)   (Correct)

No context found.

D. McAllaster, L. Gillick, F. Scattone, and M. Newman, "Fabricating conversational speech data with acoustic models: A program to examine model-data mismatch," ICSLP, Sydney, 1998.


Towards Multi-Domain Speech Understanding with Flexible and.. - Chung (2001)   (2 citations)  (Correct)

No context found.

D. McAllaster, L. Gillick, F. Scattone, and M. Newman. Fabricating conversational speech data with acoustic models: A program to examine model-data mismatch. In Proc. ICSLP '98, volume 5, pages 1847-1850, Sydney, Australia, December 1998.


Feature-based Pronunciation Modeling with Trainable.. - Livescu, Glass (2004)   (Correct)

No context found.

D. McAllaster, L. Gillick, F. Scattone, and M. Newman, "Fabricating conversational speech data with acoustic models: A program to examine model-data mismatch," ICSLP, Sydney, 1998.


Lexicon Adaptation for LVCSR: Speaker.. - Ward, Krech, Yu.. (2002)   (Correct)

No context found.

Don McAllaster, Larry Gillick, Francesco Scattone, and Mike Newman, "Fabricating conversational speech data with acoustic models: A program to examine model-data mismatch," in ICSLP-98, Sydney, 1998, vol. 5, pp. 1847--1850.


Speaker Dynamics as a Source of Pronunciation Variability for.. - Bates (2003)   (Correct)

No context found.

D. McAllaster, L. Gillick, F. Scattone, and M. Newman. Fabricating conversational speech data with acoustic models: A program to examine model-data mismatch. In Proceedings of ICSLP, pages 1847-1850, 1998.


Modelling Phonological Rules through Linguistic Hierarchies - Seneff, Wang (2002)   (2 citations)  (Correct)

No context found.

D. McAllaster, L. Gillick, F. Scattone, and M. Newman, "Fabricating Conversational Speech data with Acoustic Models: A Program to Examine Model-data Mismatch," ICSLP-98, Vol. 5, pp. 1847-1850, Sydney, Australia, 1998.
