A binaural room impulse response database for the evaluation of dereverberation algorithms
- in Proc. Digital Signal Processing (DSP), 2009
Cited by 27 (5 self)
This paper describes a new database of binaural room impulse responses (BRIR), referred to as the Aachen Impulse Response (AIR) database. The main field of application of this database is the evaluation of speech enhancement algorithms dealing with room reverberation. The measurements with a dummy head took place in a low-reverberant studio booth, an office room, a meeting room and a lecture room. Due to the different dimensions and acoustic properties, it covers a wide range of situations where digital hearing aids or other hands-free devices can be used. Besides the description of the database, a motivation for using binaural instead of monaural measurements is given. Furthermore, an example using a coherence-based dereverberation technique is provided to show the advantage of this database for algorithm evaluation. The AIR database is being made available online.
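As a rough illustration of how a BRIR database is typically used in evaluation, the sketch below convolves a mono "dry" signal with a left-ear and a right-ear impulse response to obtain a reverberant binaural test signal. The function names and the toy impulse responses are assumptions made for this example; they are not taken from the AIR database or its tooling.

```python
def convolve(signal, impulse_response):
    """Full linear convolution of two finite sequences."""
    out = [0.0] * (len(signal) + len(impulse_response) - 1)
    for n, s in enumerate(signal):
        for k, h in enumerate(impulse_response):
            out[n + k] += s * h
    return out


def apply_brir(dry, brir_left, brir_right):
    """Return the (left, right) reverberant ear signals for a mono dry signal."""
    return convolve(dry, brir_left), convolve(dry, brir_right)


# Toy impulse responses (hypothetical values, NOT measured data):
# a direct path followed by one attenuated reflection per ear.
brir_left = [1.0, 0.0, 0.5]
brir_right = [0.0, 0.8, 0.0, 0.4]
dry = [1.0, 2.0, 3.0]

left, right = apply_brir(dry, brir_left, brir_right)
```

A dereverberation algorithm under test would then receive `left` and `right` as input, and its output could be compared against `dry`.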
MODEL ORDER SELECTION FOR NON-NEGATIVE MATRIX FACTORIZATION WITH APPLICATION TO SPEECH ENHANCEMENT
Cited by 1 (1 self)
This report deals with the application of non-negative matrix factorization (NMF) in speech processing. A Bayesian NMF is used to find the optimal number of basis vectors for the speech signal. The result is validated by performing a speech enhancement task for several different numbers of basis vectors. The algorithm performance is measured with the Source to Distortion Ratio (SDR), which represents the overall quality of speech. The results show that for medium input SNRs, 60 basis vectors for each speaker are sufficient to model the speech spectrogram. NMF produced better SDR results than a recently developed version of the spectral subtraction algorithm. The window length was found to have a great effect on the results, but zero padding did not influence the results.
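To make the role of the number of basis vectors concrete, here is a minimal sketch of plain NMF with the standard multiplicative updates for the Euclidean cost. This is not the Bayesian NMF used in the report; it only illustrates how the reconstruction error depends on the chosen rank. The toy matrix, the function names, and the iteration count are all assumptions for the example.

```python
import random


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    cols_b = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols_b] for row in a]


def transpose(a):
    return [list(col) for col in zip(*a)]


def frobenius_error(v, w, h):
    """Frobenius norm of v - w*h."""
    approx = matmul(w, h)
    return sum((x - y) ** 2 for rv, ra in zip(v, approx) for x, y in zip(rv, ra)) ** 0.5


def nmf(v, rank, iterations=200, seed=0, eps=1e-9):
    """Factor a non-negative matrix v (F x T) into w (F x rank) and h (rank x T)."""
    rng = random.Random(seed)
    n_rows, n_cols = len(v), len(v[0])
    w = [[rng.random() + eps for _ in range(rank)] for _ in range(n_rows)]
    h = [[rng.random() + eps for _ in range(n_cols)] for _ in range(rank)]
    for _ in range(iterations):
        # H <- H * (W^T V) / (W^T W H), element-wise
        wt = transpose(w)
        num, den = matmul(wt, v), matmul(matmul(wt, w), h)
        h = [[h[i][j] * num[i][j] / (den[i][j] + eps) for j in range(n_cols)]
             for i in range(rank)]
        # W <- W * (V H^T) / (W H H^T), element-wise
        ht = transpose(h)
        num, den = matmul(v, ht), matmul(w, matmul(h, ht))
        w = [[w[i][j] * num[i][j] / (den[i][j] + eps) for j in range(rank)]
             for i in range(n_rows)]
    return w, h


# Toy "spectrogram" built from two non-negative components (hypothetical data).
v = [[1.0, 0.0, 2.0],
     [2.0, 0.0, 4.0],
     [0.0, 3.0, 3.0],
     [0.0, 1.0, 1.0]]

w1, h1 = nmf(v, rank=1)
w2, h2 = nmf(v, rank=2)
err1 = frobenius_error(v, w1, h1)
err2 = frobenius_error(v, w2, h2)
```

Because the toy matrix is built from two components, the rank-2 factorization should fit it much better than the rank-1 one. Model order selection asks the same question on real spectrograms, where a fit criterion alone would always favor more basis vectors.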
Speech Enhancement Using Nonnegative Matrix Factorization and Hidden Markov Models
- Nasser Mohammadiha, thesis for the degree of Doctor of Philosophy
Reducing interference noise in a noisy speech recording has been a challenging task for many years, yet it has a variety of applications, for example, in hands-free mobile communications, in speech recognition, and in hearing aids. Traditional single-channel noise reduction schemes, such as Wiener filtering, do not work satisfactorily in the presence of non-stationary background noise. Alternatively, supervised approaches, where the noise type is known in advance, lead to higher-quality enhanced speech signals. This dissertation proposes supervised and unsupervised single-channel noise reduction algorithms. We consider two classes of methods for this purpose: approaches based on nonnegative matrix factorization (NMF) and methods based on hidden Markov models (HMM). The contributions of this dissertation can be divided into three main (overlapping) parts. First, we propose NMF-based enhancement approaches that use temporal dependencies of the speech signals. In a standard NMF, the important temporal correlations between consecutive short-time frames are ignored. We propose both continuous and discrete state-space nonnegative dynamical models. These approaches are used to describe the dynamics of the NMF coefficients or activations. We derive optimal minimum mean squared error (MMSE) or linear MMSE estimates of the speech signal using the probabilistic formulations of NMF. Our experiments show that using temporal dynamics in the NMF-based denoising systems improves the performance greatly. Additionally, this dissertation proposes an approach to learn the noise basis matrix online from the noisy observations. This relaxes the assumption of an a-priori specified noise type and enables us to use the NMF-based denoising method in an unsupervised manner. Our experiments show that the proposed approach with online noise basis learning considerably outperforms state-of-the-art methods in different noise conditions.
Second, this thesis proposes two methods for NMF-based separation of sources with similar dictionaries. We suggest a nonnegative HMM (NHMM) for babble noise that is derived from a speech HMM. In this approach, speech and babble signals share the same basis vectors, whereas the activations of the basis vectors are different for the two signals over time. We derive an MMSE estimator for the clean speech signal using the proposed NHMM. The objective evaluations and a subjective listening test show that the proposed babble model and the final noise reduction algorithm outperform the conventional methods noticeably. Moreover, the dissertation proposes another solution to separate a desired source from a mixture with arbitrarily low artifacts. Third, an HMM-based algorithm to enhance the speech spectra using super-Gaussian priors is proposed. Our experiments show that speech discrete Fourier transform (DFT) coefficients have super-Gaussian rather than Gaussian distributions even if we limit the speech data to come from a specific phoneme. We derive a new MMSE estimator for the speech spectra that uses super-Gaussian priors. The results of our evaluations using the developed noise reduction algorithm support the super-Gaussianity hypothesis.
Effects of Noise Reduction on Speech Intelligibility, Perceived Listening Effort, and Personal Preference in Hearing-Impaired Listeners
This study evaluates the perceptual effects of single-microphone noise reduction in hearing aids. Twenty subjects with moderate sensorineural hearing loss listened to speech in babble noise processed via noise reduction from three different linearly fitted hearing aids. Subjects performed (a) speech-intelligibility tests, (b) listening-effort ratings, and (c) paired-comparison ratings on noise annoyance, speech naturalness, and overall preference. The perceptual effects of noise reduction differ between hearing aids. The results agree well with those of normal-hearing listeners in a previous study. None of the noise-reduction algorithms improved speech intelligibility, but all reduced the annoyance of noise. The noise reduction that scored best with respect to noise annoyance and preference had the worst intelligibility scores. The trade-off between intelligibility and listening comfort shows that preference measurements might be useful in addition to intelligibility measurements in the selection of noise reduction. Additionally, this trade-off should be taken into consideration to create realistic expectations in hearing-aid users.
Germany,
Development of algorithms for digital hearing aid processing includes many steps from the first algorithmic idea and implementation up to field tests with hearing impaired patients. Each of these steps has its own requirements towards the development environment. However, a common platform throughout the whole development process is desirable. This paper gives an overview of the application of Linux Audio in the hearing aid algorithm development process. The performance of portable hardware in terms of delay, battery runtime and processing power is investigated.