| X. HUANG, A. ACERO, and H. W. HON, Spoken Language Processing : A Guide to Theory, Algorithm, and System Development, Prentice Hall PTR, Upper Saddle River, N.J., 2001. |
....constant illumination and testing images with varying illumination, LDA derived features were shown to be still affected, although significantly less than PCA derived features. 2.3 Pseudo 2D Hidden Markov Model (HMM) Based Techniques. Samaria [53] extended 1D HMMs (popular in speech recognition [25, 46]) to pseudo 2D HMMs for use in face recognition. A pseudo 2D HMM for each person consists of a pseudo 2D lattice of states, each describing a distribution of feature vectors belonging to a particular area of the face. Samaria used a multivariate Gaussian (see Eqn. 27) as a model of the ....
X. Huang, A. Acero and H-W. Hon, Spoken Language Processing: A Guide to Theory, Algorithm and System Development, Prentice Hall PTR, New Jersey, 2001.
....listening situations: background noise and room reverberation. Many techniques such as spectral subtraction, adaptive noise cancellation, and comb filtering have been developed to improve the perceived quality of speech degraded by background noise, and are effective in low to moderate noise level [7]. Alternatively, computational auditory scene analysis systems treat background noise as distinct sound sources and segregate acoustic waveforms into different streams representing different sources, therefore are capable of segregating speech from noise interference and speech utterances from ....
X. Huang, A. Acero, and H.W. Hon, Spoken Language Processing: a Guide to Theory, Algorithm, and system development, Upper
....constant illumination and testing images with varying illumination, LDA derived features were shown to be still affected, although significantly less than PCA derived features. 2.3 Pseudo 2D Hidden Markov Model (HMM) Based Techniques. Samaria [52] extended 1D HMMs (popular in speech recognition [24, 45]) to pseudo 2D HMMs for use in face recognition. A pseudo 2D HMM for each person consists of a pseudo 2D lattice of states, each describing a distribution of feature vectors belonging to a particular area of the face. Samaria used a multivariate Gaussian (see Eqn. 27) as a model of the ....
X. Huang, A. Acero and H-W. Hon, Spoken Language Processing: A Guide to Theory, Algorithm and System Development, Prentice Hall PTR, New Jersey, 2001.
....once it overcomes the resistance of the vocal fold closure, it forces the vocal folds apart. Shortly afterward the air pressure is temporarily equalized, and the vocal folds close again, completing the cycle. The cycle occurs at a typical frequency of 60-160 Hz for males and 160-400 Hz for females [103, 56] (average values are 132 Hz and 223 Hz for males and females, respectively [143]). Changes in F0 by the speaker are used to denote prosodic information, such as whether a spoken sentence is a statement or a question. While most speakers are capable of changing their F0 by two octaves, variation of ....
.... Transform (FFT) algorithm [104, 105]. The square of the magnitude of the complex spectrum is represented as S (in our experiments we use a 2048 point representation). A set of triangular shaped filters is spaced according to the Mel scale [100], simulating the processing done by the human ear [56, 92, 93]. For filters chosen to cover the telephone bandwidth, the center frequencies are (in Hz) 300, 400, 500, 600, 700, 800, 900, 1000, 1149, 1320, 1516, 1741, 2000, 2297, 2639, 3031 and 3482. Moreover, to simulate critical bandwidths [100], the upper and lower passband frequencies of each filter are ....
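The center frequencies quoted in the excerpt above can be reproduced with a simple rule. This is an observation about the listed numbers, not something the excerpt states: the values are consistent with 100 Hz linear spacing from 300 Hz to 1000 Hz, followed by logarithmic spacing at five filters per octave. The function name and its parameters below are hypothetical, chosen only for illustration.

```python
def center_frequencies(linear_start=300, linear_step=100, n_linear=8,
                       filters_per_octave=5, n_log=9):
    """Return filter center frequencies in Hz, rounded to integers.

    Assumption (inferred from the listed values, not stated in the excerpt):
    linear spacing up to 1000 Hz, then log spacing with a fixed number of
    filters per octave above the last linear center.
    """
    # Linear region: 300, 400, ..., 1000 Hz with the default arguments.
    freqs = [linear_start + linear_step * i for i in range(n_linear)]
    last_linear = freqs[-1]
    # Log region: each step multiplies the frequency by 2**(1/filters_per_octave).
    for k in range(1, n_log + 1):
        freqs.append(round(last_linear * 2.0 ** (k / filters_per_octave)))
    return freqs


if __name__ == "__main__":
    print(center_frequencies())
```

With the default arguments the function returns the 17 values listed in the excerpt. Changing `filters_per_octave` gives a denser or sparser log region, which may not correspond to any filterbank used in the cited work.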
[Article contains additional citation context not shown here]
X. Huang, A. Acero and H-W. Hon, Spoken Language Processing: A Guide to Theory, Algorithm and System Development, Prentice Hall PTR, New Jersey, 2001.
....once it overcomes the resistance of the vocal fold closure, it forces the vocal folds apart. Shortly afterward the air pressure is temporarily equalized, and the vocal folds close again, completing the cycle. The cycle occurs at a typical frequency of 60-160 Hz for males and 160-400 Hz for females [39, 23] (average values are 132 Hz and 223 Hz for males and females, respectively [54]). Changes in F0 by the speaker are used to denote prosodic information, such as whether a spoken sentence is a statement or a question. While most speakers are capable of changing their F0 by two octaves, variation of ....
.... Transform (FFT) algorithm [41, 42]. The square of the magnitude of the complex spectrum is represented as S (in our experiments we use a 2048 point representation). A set of triangular shaped filters is spaced according to the Mel scale [40], simulating the processing done by the human ear [23, 34, 35]. For filters chosen to cover the telephone bandwidth, the center frequencies are (in Hz) 300, 400, 500, 600, 700, 800, 900, 1000, 1149, 1320, 1516, 1741, 2000, 2297, 2639, 3031 and 3482. Moreover, to simulate critical bandwidths [40], the upper and lower passband frequencies of each filter are the ....
[Article contains additional citation context not shown here]
X. Huang, A. Acero and H-W. Hon, Spoken Language Processing: A Guide to Theory, Algorithm and System Development, Prentice Hall PTR, New Jersey, 2001.
....and multi-pass search techniques to incorporate adaptation as well as models of increased complexity. A detailed review of these components is beyond the scope of this paper, but we will briefly describe key elements that are built on in speech synthesis. For more details, readers are referred to [79, 36, 33]. We also note that many of the same techniques are also used in speaker recognition [50]. Mel cepstral feature extraction is used in some form or another in virtually every state-of-the-art speech recognition system. Using a rate of roughly 100 frames/second, speech windows of 20-30 ms are ....
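As a rough illustration of the framing described in the excerpt above (about 100 frames per second, windows of 20-30 ms), the sketch below computes the sample ranges of overlapping analysis windows. The 16 kHz sample rate, the 25 ms window, the 10 ms hop, and the function name are all assumptions made for the example; the excerpt gives only the approximate frame rate and window range.

```python
def frame_bounds(n_samples, sample_rate=16000, win_ms=25.0, hop_ms=10.0):
    """Return (start, end) sample indices for each full analysis window.

    Assumed values: 16 kHz audio, 25 ms windows, 10 ms hop (100 frames/s).
    Trailing samples that do not fill a whole window are dropped.
    """
    win = int(round(sample_rate * win_ms / 1000.0))  # window length in samples
    hop = int(round(sample_rate * hop_ms / 1000.0))  # frame shift in samples
    bounds = []
    start = 0
    while start + win <= n_samples:
        bounds.append((start, start + win))
        start += hop
    return bounds


if __name__ == "__main__":
    frames = frame_bounds(16000)  # one second of audio at the assumed rate
    print(len(frames), frames[0], frames[-1])
```

Because each window is longer than the hop, consecutive frames overlap, and one second of audio yields slightly fewer than 100 full frames unless the signal is padded at the end.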
X. Huang, A. Acero, and H.-W. Hon, Spoken Language Processing: A Guide to Theory, Algorithm and System Development, Prentice Hall PTR, NJ, 2001.
....once it overcomes the resistance of the vocal fold closure, it forces the vocal folds apart. Shortly afterward the air pressure is temporarily equalized, and the vocal folds close again, completing the cycle. The cycle occurs at a typical frequency of 60-160 Hz for males and 160-400 Hz for females [99, 53] (average values are 132 Hz and 223 Hz for males and females, respectively [138]). Changes in F0 by the speaker are used to denote prosodic information, such as whether a spoken sentence is a statement or a question. While most speakers are capable of changing their F0 by two octaves, variation of ....
.... Fast Fourier Transform (FFT) algorithm [100, 101]. The square of the magnitude of the spectrum is represented as g (in our experiments we use a 2048 point representation). A set of triangular shaped filters is spaced according to the Mel scale [96], simulating the processing done by the human ear [53, 88, 89]. For filters chosen to cover the telephone bandwidth, the center frequencies are (in Hz) 300, 400, 500, 600, 700, 800, 900, 1000, 1149, 1320, 1516, 1741, 2000, 2297, 2639, 3031 and 3482. Moreover, to simulate critical bandwidths [96], the upper and lower passband frequencies of each filter are ....
[Article contains additional citation context not shown here]
X. Huang, A. Acero and H-W. Hon, Spoken Language Processing: A Guide to Theory, Algorithm and System Development, Prentice Hall PTR, New Jersey, 2001.
....the evaluators on the reference phrasing, the phrasing of 6 (7%) sentences was considered unacceptable, 31 (34%) good and 53 (59%) acceptable. In this case, the number of sentences for which at least one evaluator made the same phrasing went up to 50 (56%). 4.5. Break Level Results. According to [16]: "There are many reasonable places to pause in long sentence, but few where it is critical not to pause." That is, the errors in the assignment of prosodic breaks cannot be found only by matching the breaks in the reference phrasing, there are surely other acceptable locations. The problem is to ....
X. Huang, A. Acero, and H. Hon, Spoken Language Processing: A Guide to Theory, Algorithm, and System Development, Prentice Hall, 2001.
No context found.
Xuedong Huang, Alex Acero, and Hsiao-Wuen Hon, Spoken Language Processing: A Guide to Theory, Algorithm, and System Development, Prentice Hall PTR, 2001.
No context found.
X. HUANG, A. ACERO, and H. W. HON, Spoken Language Processing : A Guide to Theory, Algorithm, and System Development, Prentice Hall PTR, Upper Saddle River, N.J., 2001.
No context found.
Huang X., Acero A., Hon H-W. 2001. Spoken Language Processing: A guide to theory, algorithm and system development. Prentice Hall.
No context found.
Huang, X., Acero, A., and Hon, H. (2001). Spoken Language Processing: A Guide to Theory, Algorithm, and System Development. Prentice-Hall.
No context found.
X. Huang, A. Acero, and H. Hon, Spoken Language Processing: A guide to theory, algorithm, and system development. Prentice Hall, Inc., 2001, ISBN 0-13-022616-5.
No context found.
X. Huang, A. Acero and H-W. Hon, Spoken Language Processing: A Guide to Theory, Algorithm and System Development, Prentice Hall PTR, NJ, 2001, pages 853-855.
No context found.
X. D. Huang, A. Acero, and H. W. Hon, Spoken Language Processing: A Guide to Theory, Algorithm, and System Development, Prentice Hall PTR. 2001.
No context found.
Huang, X., Acero, A., and Hon, H.-W. Spoken Language Processing: a Guide to Theory, Algorithm, and System Development. Prentice-Hall, New Jersey, 2001.
No context found.
X. Huang, A. Acero and H.-W. Hon, Spoken Language Processing - A Guide to Theory, Algorithm and System Development, Prentice Hall, 2001.
No context found.
X. Huang, A. Acero, and H.-W. Hon. Spoken Language Processing - A Guide to Theory, Algorithm, and System Development. Prentice Hall, 2001.
No context found.
X. Huang, A. Acero, and H.-W. Hon, Spoken Language Processing --- A Guide to Theory, Algorithm, and System Development, Prentice Hall, 2001.
No context found.
Xuedong Huang, Alex Acero, and Hsiao-Wuen Hon, Spoken Language Processing: A Guide to Theory, Algorithm, and System Development, Prentice Hall, Englewood Cliffs, New Jersey, 2001.
No context found.
X. Huang, A. Acero, and H. Hon. Spoken Language Processing: A Guide to Theory, Algorithm, and System Development. Prentice Hall, 2001.
No context found.
X. Huang, A. Acero and H-W. Hon, Spoken Language Processing: A Guide to Theory, Algorithm and System Development, Prentice Hall PTR, New Jersey, 2001.
No context found.
X. Huang, A. Acero and H-W. Hon, Spoken Language Processing: A Guide to Theory, Algorithm and System Development, Prentice Hall PTR, New Jersey, 2001.
CiteSeer.IST - Copyright Penn State and NEC