  Automatic detection of corrupt spectrographic features for robust speech recognition (2000) [3 citations — 1 self]

by Michael L. Seltzer
http://www.cs.cmu.edu/afs/cs/user/robust/www/Thesis/mseltzer.msthesis.pdf

Abstract:

When speech is corrupted by noise, the performance of automatic speech recognition systems degrades significantly. Many algorithms have been proposed that compensate for the negative effects of noise on speech and greatly improve recognition accuracy. However, these methods assume that the corrupting noise is stationary. If the noise is non-stationary, these methods fail. A promising new group of compensation algorithms has recently emerged that does not have this restriction on the noise characteristics. These methods operate on the notion that noise affects different frequency bands of speech differently, depending on the relative energies of the speech and the noise at each time-frequency location. In a spectrographic display of noisy speech, regions of low SNR will be more corrupt than regions of high SNR. Low-SNR regions of a spectrogram are considered to be “missing” or “unreliable” and are removed from the spectrogram. Noise compensation is carried out either by estimating the missing regions from the remaining regions in some manner prior to recognition, or by performing recognition directly on incomplete spectrograms. These techniques clearly require a “spectrographic mask” which accurately labels the reliable and unreliable regions of a spectrogram. Currently, there are no good techniques for accurately estimating such a mask. The methods that have been used so far rely on assumptions about the interfering noise such as
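The idea of a spectrographic mask described in the abstract can be illustrated with a minimal sketch. It builds a so-called oracle mask by thresholding the local SNR of each time-frequency cell, which presumes that the speech and noise power are known separately. That is exactly what is not available in practice, so this is not the estimation method of the thesis, only an illustration of what the mask labels. The function names, the 0 dB threshold, and the use of `None` for removed cells are all assumptions made for this example.

```python
import math


def local_snr_db(speech_power, noise_power, floor=1e-12):
    """Local SNR in dB for one time-frequency cell (floor avoids log of zero)."""
    return 10.0 * math.log10(max(speech_power, floor) / max(noise_power, floor))


def oracle_mask(speech, noise, threshold_db=0.0):
    """Mark a cell reliable (True) when its local SNR exceeds the threshold.

    `speech` and `noise` are equally shaped lists of rows of power values.
    The threshold value is an assumption for illustration only.
    """
    return [
        [local_snr_db(s, n) > threshold_db for s, n in zip(s_row, n_row)]
        for s_row, n_row in zip(speech, noise)
    ]


def remove_unreliable(noisy, mask):
    """Replace unreliable cells with None, leaving an incomplete spectrogram."""
    return [
        [value if keep else None for value, keep in zip(v_row, m_row)]
        for v_row, m_row in zip(noisy, mask)
    ]


# Toy 2x2 power "spectrograms" (hypothetical numbers).
speech = [[100.0, 1.0], [4.0, 50.0]]
noise = [[1.0, 10.0], [4.0, 5.0]]
# Assume the powers add, so the noisy observation is their sum.
noisy = [[s + n for s, n in zip(sr, nr)] for sr, nr in zip(speech, noise)]

mask = oracle_mask(speech, noise)
incomplete = remove_unreliable(noisy, mask)
```

A recognizer or reconstruction step would then operate only on the cells that remain in `incomplete`. The difficulty the abstract points to is producing `mask` from the noisy observation alone.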

Citations

4345 Maximum likelihood from incomplete data via the EM algorithm – Dempster, Laird, et al. - 1977
2961 Pattern Classification and Scene Analysis – Duda, Hart - 1973
880 Fundamentals of Speech Recognition – Rabiner, Juang - 1993
342 Comparison of parametric representations for monosyllabic word recognition in continuously spoken sentences – Davis, Mermelstein - 1980
269 Maximum a posteriori estimation for multivariate gaussian mixture observations of markov chains – Gauvain, Lee - 1994
250 Auditory scene analysis – Bregman - 1990
192 Suppression of acoustic noise in speech using spectral subtraction – Boll - 1979
149 An Introduction to the Psychology of Hearing – Moore - 1997
136 Modern Spectral Estimation: Theory and Application – Kay - 1988
101 Environmental robustness in automatic speech recognition – Acero, Stern
93 Prediction-driven computational auditory scene analysis – Ellis - 1996
88 Speech database development: Design and analysis – Lamel, Kassel, et al. - 1986
79 Computational auditory scene analysis – Brown, Cooke - 1994
75 Robust automatic speech recognition with missing and unreliable acoustic data – Cooke, Green, et al. - 2001
74 Virtual Pitch and Phase Sensitivity of a Computer Model of the Auditory Periphery – Meddis, Hewitt - 1991
73 The DARPA 1000-Word Resource Management Database for Continuous Speech Recognition – Price, Fisher, et al. - 1988
63 Speech Communications: Human and Machine – O’Shaughnessy - 1999
61 A Joint Synchrony/Mean-Rate Model of Auditory Speech Processing – Seneff - 1988
55 Speech recognition by machines and humans – Lippmann - 1997
44 Noise estimation techniques for robust speech recognition – Hirsch, Ehrlicher - 1995
41 Using missing feature theory to actively select features for robust speech recognition with interruptions, filtering and noise – Lippmann, Carlson - 1997
40 Speech and Hearing in Communication – Fletcher - 1953
40 Speaker Adaptation of HMMs Using Linear Regression – Leggetter, Woodland - 1994
37 Speech Recognition in Noisy Environments – Moreno - 1996
34 Pitch Determination of Speech Signals: Algorithms and Devices – Hess - 1983
31 A Robust Algorithm for Pitch Tracking (RAPT) – Talkin - 1995
25 Missing data techniques for robust speech recognition – Cooke, Morris, et al. - 1997
24 Multi-microphone correlation-based processing for robust speech recognition – Sullivan, Stern
23 HMM recognition in noise using parallel model combination – Gales, Young - 1993
22 Handling missing data in speech recognition – Cooke, Green, et al. - 1994
10 A computer model of auditory stream segregation – Beauvois, Meddis - 1991
10 Maximum likelihood estimation for mixed continuous and categorical data with missing values – Little, Schluchter - 1985
9 Cochannel Speaker Separation by Harmonic Enhancement and Suppression – Morgan, George, et al. - 1997
9 Reconstruction of Incomplete Spectrograms for Robust Speech Recognition – Raj - 2000
8 A comparative study of several pitch detection algorithms – Rabiner, Cheng, et al. - 1976
6 An Approach to Co-Channel Talker Interference Suppression Using a Sinusoidal Model for Speech – Quatieri, Danisewicz - 1990
6 A theory and computational model of monaural auditory sound separation – Weintraub - 1985
2 Auditory Organization and Speech Perception: Pointers for Robust ASR – Cooke, Green - 2000
2 Optimal estimators for spectral restoration of noisy speech – Porter, Boll - 1984
2 Probability Theory, Random Processes, and Estimation Theory for Engineers – Stark, Woods - 1994