23 citations found.
X. HUANG, A. ACERO, and H. W. HON, Spoken Language Processing : A Guide to Theory, Algorithm, and System Development, Prentice Hall PTR, Upper Saddle River, N.J., 2001.


This paper is cited in the following contexts:
Face Processing - Sanderson (2003)   (Correct)

....constant illumination and testing images with varying illumination, LDA derived features were shown to be still affected, although significantly less than PCA derived features. 2.3 Pseudo 2D Hidden Markov Model (HMM) Based Techniques Samaria [53] extended 1D HMMs (popular in speech recognition [25, 46]) to pseudo 2D HMMs for use in face recognition. A pseudo 2D HMM for each person consists of a pseudo 2D lattice of states, each describing a distribution of feature vectors belonging to a particular area of the face. Samaria used a multivariate Gaussian (see Eqn. 27) as a model of the ....

X. Huang, A. Acero and H-W. Hon, Spoken Language Processing: A Guide to Theory, Algorithm and System Development, Prentice Hall PTR, New Jersey, 2001.


A One-Microphone Algorithm For Reverberant Speech Enhancement - Wu, Wang (2003)   (Correct)

....listening situations: background noise and room reverberation. Many techniques such as spectral subtraction, adaptive noise cancellation, and comb filtering have been developed to improve the perceived quality of speech degraded by background noise, and are effective in low to moderate noise level [7]. Alternatively, computational auditory scene analysis systems treat background noise as distinct sound sources and segregate acoustic waveforms into different streams representing different sources, therefore are capable of segregating speech from noise interference and speech utterances from ....

X. Huang, A. Acero, and H.W. Hon, Spoken Language Processing: a Guide to Theory, Algorithm, and system development, Upper


Face Processing & Frontal Face Verification - Sanderson (2003)   (Correct)

....constant illumination and testing images with varying illumination, LDA derived features were shown to be still affected, although significantly less than PCA derived features. 2.3 Pseudo 2D Hidden Markov Model (HMM) Based Techniques Samaria [52] extended 1D HMMs (popular in speech recognition [24, 45]) to pseudo 2D HMMs for use in face recognition. A pseudo 2D HMM for each person consists of a pseudo 2D lattice of states, each describing a distribution of feature vectors belonging to a particular area of the face. Samaria used a multivariate Gaussian (see Eqn. 27) as a model of the ....

X. Huang, A. Acero and H-W. Hon, Spoken Language Processing: A Guide to Theory, Algorithm and System Development, Prentice Hall PTR, New Jersey, 2001.


Automatic Person Verification Using Speech and Face Information - Sanderson (2002)   (1 citation)  (Correct)

....once it overcomes the resistance of the vocal fold closure, it forces the vocal folds apart. Shortly afterward the air pressure is temporarily equalized, and the vocal folds close again, completing the cycle. The cycle occurs at a typical frequency of 60-160 Hz for males and 160-400 Hz for females [103, 56] (average values are 132 Hz and 223 Hz for males and females, respectively [143]). Changes in F0 by the speaker are used to denote prosodic information, such as whether a spoken sentence is a statement or a question. While most speakers are capable of changing their F0 by two octaves, variation of ....

.... Transform (FFT) algorithm [104, 105]. The square of the magnitude of the complex spectrum is represented as S (in our experiments we use a 2048 point representation). A set of triangular shaped filters is spaced according to the Mel scale [100], simulating the processing done by the human ear [56, 92, 93]. For filters chosen to cover the telephone bandwidth, the center frequencies are (in Hz) 300, 400, 500, 600, 700, 800, 900, 1000, 1149, 1320, 1516, 1741, 2000, 2297, 2639, 3031 and 3482. Moreover, to simulate critical bandwidths [100], the upper and lower passband frequencies of each filter are ....
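The center frequencies quoted in the context above can be reproduced with a short sketch. The excerpt only lists the values; the generating rule used here (100 Hz linear steps from 300 Hz to 1000 Hz, then five logarithmic steps per octave, each value rounded to the nearest Hz) is an assumption inferred from the numbers, not something the excerpt states. The function name and its parameters are hypothetical.

```python
def center_frequencies(start_hz=300.0, step_hz=100.0, n_linear=8,
                       n_log=9, steps_per_octave=5):
    """Return filter center frequencies (Hz), rounded to integers.

    Assumed rule (inferred from the quoted list, not given in the source):
    linear spacing up to 1000 Hz, then a constant ratio of 2**(1/5) per step.
    """
    # Linear part: 300, 400, ..., 1000 Hz
    linear = [start_hz + i * step_hz for i in range(n_linear)]
    # Logarithmic part: each step multiplies the last linear value by 2**(k/5)
    last = linear[-1]
    log_part = [last * 2.0 ** (k / steps_per_octave) for k in range(1, n_log + 1)]
    return [round(f) for f in linear + log_part]


if __name__ == "__main__":
    print(center_frequencies())
```

With the default arguments the sketch yields 17 values, the same count as the list in the excerpt.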

[Article contains additional citation context not shown here]

X. Huang, A. Acero and H-W. Hon, Spoken Language Processing: A Guide to Theory, Algorithm and System Development, Prentice Hall PTR, New Jersey, 2001.


Speech Processing & Text-Independent Automatic Person Verification - Sanderson (2002)   (Correct)

....once it overcomes the resistance of the vocal fold closure, it forces the vocal folds apart. Shortly afterward the air pressure is temporarily equalized, and the vocal folds close again, completing the cycle. The cycle occurs at a typical frequency of 60-160 Hz for males and 160-400 Hz for females [39, 23] (average values are 132 Hz and 223 Hz for males and females, respectively [54]). Changes in F0 by the speaker are used to denote prosodic information, such as whether a spoken sentence is a statement or a question. While most speakers are capable of changing their F0 by two octaves, variation of ....

.... Transform (FFT) algorithm [41, 42]. The square of the magnitude of the complex spectrum is represented as S (in our experiments we use a 2048 point representation). A set of triangular shaped filters is spaced according to the Mel scale [40], simulating the processing done by the human ear [23, 34, 35]. For filters chosen to cover the telephone bandwidth, the center frequencies are (in Hz) 300, 400, 500, 600, 700, 800, 900, 1000, 1149, 1320, 1516, 1741, 2000, 2297, 2639, 3031 and 3482. Moreover, to simulate critical bandwidths [40], the upper and lower passband frequencies of each filter are the ....

[Article contains additional citation context not shown here]

X. Huang, A. Acero and H-W. Hon, Spoken Language Processing: A Guide to Theory, Algorithm and System Development, Prentice Hall PTR, New Jersey, 2001.


The Impact Of Speech Recognition On Speech Synthesis - Ostendorf, Bulyko (2002)   (1 citation)  (Correct)

....and multi-pass search techniques to incorporate adaptation as well as models of increased complexity. A detailed review of these components is beyond the scope of this paper, but we will briefly describe key elements that are built on in speech synthesis. For more details, readers are referred to [79, 36, 33]. We also note that many of the same techniques are also used in speaker recognition [50]. Mel cepstral feature extraction is used in some form or another in virtually every state-of-the-art speech recognition system. Using a rate of roughly 100 frames/second, speech windows of 20-30 ms are ....

X. Huang, A. Acero, and H.-W. Hon, Spoken Language Processing: A Guide to Theory, Algorithm and System Development, Prentice Hall PTR, NJ, 2001.


Automatic Person Verification Using Speech and Face Information - Sanderson (2002)   (1 citation)  (Correct)

....once it overcomes the resistance of the vocal fold closure, it forces the vocal folds apart. Shortly afterward the air pressure is temporarily equalized, and the vocal folds close again, completing the cycle. The cycle occurs at a typical frequency of 60-160 Hz for males and 160-400 Hz for females [99, 53] (average values are 132 Hz and 223 Hz for males and females, respectively [138]). Changes in F0 by the speaker are used to denote prosodic information, such as whether a spoken sentence is a statement or a question. While most speakers are capable of changing their F0 by two octaves, variation of ....

.... Fast Fourier Transform (FFT) algorithm [100, 101]. The square of the magnitude of the spectrum is represented as g (in our experiments we use a 2048 point representation). A set of triangular shaped filters is spaced according to the Mel scale [96], simulating the processing done by the human ear [53, 88, 89]. For filters chosen to cover the telephone bandwidth, the center frequencies are (in Hz) 300, 400, 500, 600, 700, 800, 900, 1000, 1149, 1320, 1516, 1741, 2000, 2297, 2639, 3031 and 3482. Moreover, to simulate critical bandwidths [96], the upper and lower passband frequencies of each filter are ....

[Article contains additional citation context not shown here]

X. Huang, A. Acero and H-W. Hon, Spoken Language Processing: A Guide to Theory, Algorithm and System Development, Prentice Hall PTR, New Jersey, 2001.


Prosodic Phrasing: Machine and Human Evaluation - Viana, Oliveira, Mata (2001)   (2 citations)  (Correct)

....the evaluators on the reference phrasing, the phrasing of 6 (7%) sentences was considered unacceptable, 31 (34%) good and 53 (59%) acceptable. In this case, the number of sentences for which at least one evaluator made the same phrasing went up to 50 (56%). 4.5. Break Level Results According to [16]: "There are many reasonable places to pause in long sentence, but few where it is critical not to pause". That is, the errors in the assignment of prosodic breaks cannot be found only by matching the breaks in the reference phrasing, there are surely other acceptable locations. The problem is to ....

X. Huang, A. Acero, and H. Hon, Spoken Language Processing: A Guide to Theory, Algorithm, and System Development, Prentice Hall, 2001.


Air- and Bone-Conductive Integrated Microphones.. - Zheng, Liu..   Self-citation (Huang)   (Correct)

No context found.

Xuedong Huang, Alex Acero, and Hsiao-Wuen Hon, Spoken Language Processing: A Guide to Theory, Algorithm, and System Development, Prentice Hall PTR, 2001.


Unit Selection Speech Synthesis In Noise - Milos Cer Nak   (Correct)

No context found.

X. HUANG, A. ACERO, and H. W. HON, Spoken Language Processing : A Guide to Theory, Algorithm, and System Development, Prentice Hall PTR, Upper Saddle River, N.J., 2001.


Generating Statistical Language Models from Interpretation.. - Jonson (2006)   (Correct)

No context found.

Huang X., Acero A., Hon H-W. 2001. Spoken Language Processing: A guide to theory, algorithm and system development. Prentice Hall.


Spoken Dialogue Management Using Hierarchical Reinforcement.. - Cuayáhuitl (2005)   (Correct)

No context found.

Huang, X., Acero, A., and Hon, H. (2001). Spoken Language Processing: A Guide to Theory, Algorithm, and System Development. Prentice-Hall.


A Comparative Study of Filter Bank Spacing for - Speech Recognition Ben   (Correct)

No context found.

X. Huang, A. Acero, and H. Hon, Spoken Language Processing: A guide to theory, algorithm, and system development. Prentice Hall, Inc., 2001, ISBN 0-13-022616-5.


An XPath-based Discourse Analysis Module - Dialogue (2004)   (Correct)

No context found.

X. Huang, A. Acero and H-W. Hon, Spoken Language Processing: A Guide to Theory, Algorithm and System Development, Prentice Hall PTR, NJ, 2001, pages 853-855.


Decision Combination in Speech Metadata Extraction - Lin (2003)   (Correct)

No context found.

X. D. Huang, A. Acero, and H. W. Hon, Spoken Language Processing: A Guide to Theory, Algorithm, and System Development, Prentice Hall PTR. 2001.


Spectral Features for Automatic Text-Independent Speaker.. - Kinnunen (2003)   (Correct)

No context found.

Huang, X., Acero, A., and Hon, H.-W. Spoken Language Processing: a Guide to Theory, Algorithm, and System Development. Prentice-Hall, New Jersey, 2001.


Outerproduct Of Trajectory Matrix - For Acoustic Modeling   (Correct)

No context found.

X. Huang, A. Acero and H.-W. Hon, Spoken Language Processing: A Guide to Theory, Algorithm and System Development, Prentice Hall, 2001.


Speech recognition in the JAS 39 Gripen aircraft - - Adaptation To Speech   (Correct)

No context found.

X. Huang, A. Acero, and H.-W. Hon. Spoken Language Processing: A Guide to Theory, Algorithm, and System Development. Prentice Hall, 2001.


Accurate Spectral Envelope Estimation for.. - Shiga, King   (Correct)

No context found.

X. Huang, A. Acero, and H.-W. Hon, Spoken Language Processing: A Guide to Theory, Algorithm, and System Development, Prentice Hall, 2001.


Integrating Speaker Identification and Learning with Adaptive.. - Fink, Plötz (2004)   (Correct)

No context found.

Xuedong Huang, Alex Acero, and Hsiao-Wuen Hon, Spoken Language Processing: A Guide to Theory, Algorithm, and System Development, Prentice Hall, Englewood Cliffs, New Jersey, 2001.


Using Morphossyntactic Information in TTS - Systems Comparing Strategies (2003)   (Correct)

No context found.

X. Huang, A. Acero, and H. Hon. Spoken Language Processing: A Guide to Theory, Algorithm, and System Development. Prentice Hall, 2001.


Speech Processing & Text-Independent . . . - Sanderson (2002)   (Correct)

No context found.

X. Huang, A. Acero and H-W. Hon, Spoken Language Processing: A Guide to Theory, Algorithm and System Development, Prentice Hall PTR, New Jersey, 2001.


Speech Processing & Text-Independent Automatic Person Verification - Sanderson (2002)   (Correct)

No context found.

X. Huang, A. Acero and H-W. Hon, Spoken Language Processing: A Guide to Theory, Algorithm and System Development, Prentice Hall PTR, New Jersey, 2001.


CiteSeer.IST - Copyright Penn State and NEC