by Martin Cooke, Phil Green, Ljubomir Josifovski, Ascension Vizinho
in Proc., Robust’99
http://www.dcs.shef.ac.uk/~ljupco/tampere.ps.gz
Abstract:
Human speech perception withstands a wide variety of distortions, both experimentally applied and naturally occurring. A novel approach to these situations in robust ASR identifies the spectro-temporal regions which carry reliable speech evidence and treats the remainder as missing or uncertain. This standpoint makes minimal assumptions about any noise background. This paper describes two approaches to the adaptation of continuous-density hidden Markov model-based speech recognisers to deal with missing and uncertain acoustic data. The first computes output probabilities on the basis of the reliable evidence only, while the second estimates values for the unreliable regions by conditioning on the reliable parts. Both techniques are evaluated on the TIDigits corpus for several NOISEX noise sources, using spectral subtraction to identify reliable regions. These studies demonstrate that the two schemes behave comparably, and that both produce a significant performance advantage over spectral subtraction alone.
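The two schemes named in the abstract can be illustrated with a minimal sketch. It assumes, purely for illustration, that each HMM state's output density is a Gaussian mixture with diagonal covariances and that a boolean mask marks the reliable spectral channels; the abstract does not give the paper's exact formulation. The function names `marginal_log_likelihood` and `impute` are hypothetical.

```python
import numpy as np

def log_gauss_diag(x, mean, var):
    """Per-dimension log density of a diagonal Gaussian."""
    return -0.5 * (np.log(2.0 * np.pi * var) + (x - mean) ** 2 / var)

def component_scores(x, mask, weights, means, variances):
    """Log of weight times marginal density of the reliable features,
    one value per mixture component.

    With diagonal covariances, integrating out an unreliable dimension
    simply drops its factor from the product, so it contributes 0 in
    the log domain.
    """
    per_dim = log_gauss_diag(x[None, :], means, variances)  # shape (K, D)
    per_dim = np.where(mask[None, :], per_dim, 0.0)         # ignore unreliable dims
    return np.log(weights) + per_dim.sum(axis=1)

def marginal_log_likelihood(x, mask, weights, means, variances):
    """Scheme 1 (assumed form): log p(x_reliable) for one state."""
    scores = component_scores(x, mask, weights, means, variances)
    top = scores.max()
    return top + np.log(np.exp(scores - top).sum())          # log-sum-exp

def impute(x, mask, weights, means, variances):
    """Scheme 2 (assumed form): replace unreliable features by their
    expected value given the reliable ones."""
    scores = component_scores(x, mask, weights, means, variances)
    post = np.exp(scores - scores.max())
    post /= post.sum()                                       # p(component | x_reliable)
    estimate = post @ means                                  # posterior-weighted means
    return np.where(mask, x, estimate)

# Toy state with two components over two spectral channels (made-up values).
weights = np.array([0.5, 0.5])
means = np.array([[0.0, 0.0], [10.0, 10.0]])
variances = np.ones((2, 2))
x = np.array([10.0, np.nan])          # second channel is unreliable
mask = np.array([True, False])

score = marginal_log_likelihood(x, mask, weights, means, variances)
filled = impute(x, mask, weights, means, variances)
```

In this toy case the reliable channel strongly favours the second component, so the imputed value for the unreliable channel lands close to that component's mean. In a recogniser, either the marginal score or the completed vector would feed the usual Viterbi decoding; how the mask is derived (spectral subtraction in the paper's experiments) is outside this sketch.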