8 citations found.
T. Hain, P. Woodland, G. Evermann, and D. Povey. The cu-htk march 2000.


This paper is cited in the following contexts:
Efficient Scalable Encoding for Distributed Speech.. - Srinivasamurthy.. (2003)   (1 citation)  (Correct)

....these models. This however also increases the computation and memory requirements of the system. Current desktop computers usually have sufficient computation and memory resources to support large vocabulary continuous speech recognition (LVCSR) tasks requiring complex acoustic and language models [3]. Mobile devices however have limited computation, memory and storage capabilities. While a simple speech recognizer with a small grammar (e.g. limited size voice dialing) can be implemented on a mobile device, more complex LVCSR tasks require computation and memory beyond what is available, or ....

....MFCCs was determined by dropping one MFCC at a time and finding the effect on N best recognition by DTW. The MFCC that introduces the most error in recognition was declared as the most important and so on. From our experiments the coefficients were ordered from most important to least important as [2,3,0,1,6,7,4,8,5,10,11,9]. Note that the number of models retained at each intermediate step can be variable depending on the unknown utterance. Also at any intermediate step if the number of models retained is only one, then the subsequent recognizers need not be used and the unknown utterance is classified as the digit ....

P. Woodland, T. Hain, G. Evermann, and D. Povey, "CU-HTK march 2001.


Factor analysed hidden Markov models for speech recognition - Rosti, Gales (2003)   (Correct)

....studied in the future. 4.3 Switchboard 68 Hours For the experiments performed in this section, a 68 hour subset of the Switchboard (Hub5) acoustic training data set was used. 862 sides of the Switchboard 1 and 92 sides of the Call Home English were used. The set is described as h5train00sub in [9]. As with Minitrain, the baseline was a gender independent decision tree clustered tied state cross word triphone Gaussian mixture HMM system. The 1998 Switchboard evaluation data set was used for testing. The baseline HMM system word error rates with the order of number of free parameters are ....

T. Hain, P.C. Woodland, G. Evermann, and D. Povey. The CU-HTK March 2000.


Clustering Wide-Contexts and HMM Topologies for Spontaneous.. - Shafran (2001)   (1 citation)  (Correct)

....(pentaphones) are used. Conditioning only on phonemic context does not capture the acoustic variation of conversational speech fully. In recent years, augmenting the context with position of phoneme in the word has brought additional improvements to ASR performance, and it is now widely used (e.g. [55]). This is consistent with observations in linguistic studies about word position effects on different consonants, using electropalatography (EPG) [73]. The linguopalatal (tongue palate) contact, which affect the strength and duration of sound produced, for word initial consonants is significantly ....

.... in a read speech corpus, where two separate trees were grown for male and female speakers, with a total of about 80 hours of speech [70, 130]. Progressively, it has been employed for larger tasks, and now as much as 250 hours of speech are clustered with pentaphones and 50 word position features [55]. When the number of feature values increase, a few factors start affecting the automatic training of decision trees. The number of unique labels tend to increase, and the associated sufficient statistics needed to train the tree requires large amounts of memory. For example, Figure 5.1 shows the ....

Thomas Hain, Philip Woodland, Gunnar Evermann, and D. Povey. The CU-HTK March 2000.


A Low-Power Accelerator for the SPHINX 3 Speech Recognition.. - Mathew, Davis, Fang (2003)   (1 citation)  (Correct)

No context found.

T. Hain, P. Woodland, G. Evermann, and D. Povey. The cu-htk march 2000.


Conversational Telephone Speech Recognition - Gauvain Lamel Schwenk   (Correct)

No context found.

T. Hain, P.C. Woodland, G. Evermann, D. Povey, "The CUHTK March 2000.


A Gaussian Probability Accelerator for SPHINX 3 - Mathew, Davis, Fang (2003)   (Correct)

No context found.

T. Hain, P. Woodland, G. Evermann, and D. Povey. The cu-htk march 2000.


A Gaussian Probability Accelerator for SPHINX 3 - Mathew, Davis, Fang (2003)   (Correct)

No context found.

T. Hain, P. Woodland, G. Evermann, and D. Povey. The cu-htk march 2000.


Linear Gaussian Models for Speech Recognition - Rosti (2004)   (Correct)

No context found.

T. Hain, P.C. Woodland, G. Evermann, and D. Povey. The CU-HTK March 2000.

