Blind separation of speech mixtures via time-frequency masking
- IEEE TRANSACTIONS ON SIGNAL PROCESSING (2002) SUBMITTED
, 2004
Cited by 322 (5 self)
Binary time-frequency masks are powerful tools for the separation of sources from a single mixture. Perfect demixing via binary time-frequency masks is possible provided the time-frequency representations of the sources do not overlap: a condition we call W-disjoint orthogonality. We introduce here the concept of approximate W-disjoint orthogonality and present experimental results demonstrating the level of approximate W-disjoint orthogonality of speech in mixtures of various orders. The results demonstrate that there exist ideal binary time-frequency masks that can separate several speech signals from one mixture. While determining these masks blindly from just one mixture is an open problem, we show that we can approximate the ideal masks in the case where two anechoic mixtures are provided. Motivated by the maximum likelihood mixing parameter estimators, we define a power weighted two-dimensional (2-D) histogram constructed from the ratio of the time-frequency representations of the mixtures that is shown to have one peak for each source with peak location corresponding to the relative attenuation and delay mixing parameters. The histogram is used to create time-frequency masks that partition one of the mixtures into the original sources. Experimental results on speech mixtures verify the technique. Example demixing results can be found online at
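The histogram-and-mask procedure in this abstract can be illustrated with a toy sketch. It is not the authors' implementation: the function names, the bin widths a_step and d_step, the nearest-peak assignment rule, and the tiny hand-made time-frequency grid are all assumptions made for illustration, and the sources are constructed to be exactly W-disjoint orthogonal, which real speech only approximates.

```python
import cmath
from collections import defaultdict

def mix_anechoic(sources, params, omegas):
    """Build two anechoic mixtures from time-frequency (TF) sources.

    sources: list of dicts mapping (frame, bin) -> complex coefficient
    params:  list of (attenuation, delay) pairs, one per source (made-up values)
    omegas:  non-zero angular frequency of each bin
    """
    x1, x2 = defaultdict(complex), defaultdict(complex)
    for src, (a, d) in zip(sources, params):
        for (t, k), v in src.items():
            x1[(t, k)] += v
            x2[(t, k)] += a * cmath.exp(-1j * omegas[k] * d) * v
    return dict(x1), dict(x2)

def demix(x1, x2, omegas, n_sources, a_step=0.1, d_step=0.1):
    """Estimate (attenuation, delay) per TF point, histogram them, mask x1."""
    est, hist = {}, defaultdict(float)
    for tk, v1 in x1.items():
        if abs(v1) < 1e-12:
            continue  # no energy here, so the mixture ratio is undefined
        ratio = x2[tk] / v1
        a = abs(ratio)
        d = -cmath.phase(ratio) / omegas[tk[1]]  # valid while |omega * d| < pi
        est[tk] = (a, d)
        # Power weighting: energetic TF points contribute more to their bin.
        hist[(round(a / a_step), round(d / d_step))] += abs(v1 * x2[tk])
    # Treat the n_sources heaviest bins as the histogram peaks.
    top = sorted(hist, key=hist.get, reverse=True)[:n_sources]
    peaks = [(i * a_step, j * d_step) for i, j in top]
    # Binary masks: every TF point of x1 is assigned to its nearest peak.
    estimates = [dict() for _ in peaks]
    for tk, (a, d) in est.items():
        j = min(range(len(peaks)),
                key=lambda n: (a - peaks[n][0]) ** 2 + (d - peaks[n][1]) ** 2)
        estimates[j][tk] = x1[tk]
    return peaks, estimates

omegas = [0.5, 1.0, 1.5]
s1 = {(0, 0): 1 + 1j, (1, 1): 2 + 0j, (0, 2): -1j}
s2 = {(0, 1): 3 + 0j, (1, 0): 1 - 2j, (1, 2): 0.5 + 0j}
x1, x2 = mix_anechoic([s1, s2], [(0.5, 1.0), (1.2, -0.5)], omegas)
peaks, estimates = demix(x1, x2, omegas, n_sources=2)
```

Because the two toy sources never share a time-frequency point, each masked copy of the first mixture reproduces one source exactly. With real speech the supports overlap slightly, so the masks only approximate the ideal ones.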
On ideal binary mask as the computational goal of auditory scene analysis
- in Speech Separation by Humans and Machines
, 2005
Cited by 99 (40 self)
In a natural environment, a target sound, such as speech, is usually mixed with acoustic interference. A sound separation system that removes or attenuates acoustic interference has many important applications, such as automatic speech recognition (ASR) and speaker identification in real
Learning spectral clustering, with application to speech separation
- JOURNAL OF MACHINE LEARNING RESEARCH
, 2006
Cited by 70 (6 self)
Spectral clustering refers to a class of techniques which rely on the eigenstructure of a similarity matrix to partition points into disjoint clusters, with points in the same cluster having high similarity and points in different clusters having low similarity. In this paper, we derive new cost functions for spectral clustering based on measures of error between a given partition and a solution of the spectral relaxation of a minimum normalized cut problem. Minimizing these cost functions with respect to the partition leads to new spectral clustering algorithms. Minimizing with respect to the similarity matrix leads to algorithms for learning the similarity matrix from fully labelled datasets. We apply our learning algorithm to the blind one-microphone speech separation problem, casting the problem as one of segmentation of the spectrogram.
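As a rough illustration of the spectral relaxation of the normalized cut mentioned in this abstract, the sketch below splits points into two clusters using the second eigenvector of the normalized similarity matrix. It shows only the standard relaxation, not the new cost functions or the learning algorithm of the paper. The function name, the power-iteration solver, the fixed start vector, and the toy similarity matrix are assumptions.

```python
import math

def spectral_bipartition(W, iters=200):
    """Two-way partition from the second eigenvector of D^{-1/2} W D^{-1/2}.

    W is a symmetric similarity matrix (list of lists) with positive row sums.
    """
    n = len(W)
    deg = [sum(row) for row in W]
    # Normalized similarity matrix; its top eigenvector is proportional to sqrt(deg).
    M = [[W[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)] for i in range(n)]
    norm = math.sqrt(sum(deg))
    v1 = [math.sqrt(d) / norm for d in deg]

    def deflate_and_normalize(v):
        # Remove the component along v1, then rescale to unit length.
        proj = sum(a * b for a, b in zip(v, v1))
        v = [a - proj * b for a, b in zip(v, v1)]
        length = math.sqrt(sum(a * a for a in v))
        return [a / length for a in v]

    # Power iteration on M + I (shifted so every eigenvalue is non-negative),
    # kept orthogonal to v1, converges to the second eigenvector provided the
    # start vector is not orthogonal to it (assumed here).
    v = deflate_and_normalize([float(i) for i in range(n)])
    for _ in range(iters):
        Mv = [sum(M[i][j] * v[j] for j in range(n)) + v[i] for i in range(n)]
        v = deflate_and_normalize(Mv)
    # Undo the D^{1/2} scaling and threshold at zero to get cluster labels.
    return [0 if v[i] / math.sqrt(deg[i]) < 0 else 1 for i in range(n)]

# Toy similarity matrix: points 0-2 and 3-5 form two tightly connected groups.
W = [[0.0 if i == j else (1.0 if (i < 3) == (j < 3) else 0.01)
      for j in range(6)] for i in range(6)]
labels = spectral_bipartition(W)
```

In the speech separation setting described above, the points would be regions of the spectrogram and W would encode how likely two regions are to belong to the same speaker. How that similarity is learned is the contribution of the paper and is not reproduced here.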
A bayesian approach for blind separation of sparse sources
- IEEE Transactions on Speech and Audio Processing
, 2005
Cited by 66 (10 self)
We present a Bayesian approach for blind separation of linear instantaneous mixtures of sources having a sparse representation in a given basis. The distributions of the coefficients of the sources in the basis are modeled by a Student t distribution, which can be expressed as a Scale Mixture of Gaussians, and a Gibbs sampler is derived to estimate the sources, the mixing matrix, the input noise variance and also the hyperparameters of the Student t distributions. The method allows for separation of underdetermined (more sources than sensors) noisy mixtures. Results are presented with audio signals using a Modified Discrete Cosine Transform basis and compared with a finite mixture of Gaussians prior approach. These results show the improved sound quality obtained with the Student t prior and the better robustness to mixing matrices close to singularity of the Markov chain Monte Carlo approach.
Survey of Sparse and Non-Sparse Methods in Source Separation
, 2005
Cited by 51 (1 self)
Source separation arises in a variety of signal processing applications, ranging from speech processing to medical image analysis. The separation of a superposition of multiple signals is accomplished by taking into account the structure of the mixing process and by making assumptions about the sources. When the information about the mixing process and sources is limited, the problem is called ‘blind’. By assuming that the sources can be represented sparsely in a given basis, recent research has demonstrated that solutions to previously problematic blind source separation problems can be obtained. In some cases, solutions are possible to problems intractable by previous non-sparse methods. Indeed, sparse methods provide a powerful approach to the separation of linear mixtures of independent data. This paper surveys the recent arrival of sparse blind source separation methods and the previously existing non-sparse methods, providing insights and appropriate hooks into the literature along the way.
Real-Time Time-Frequency Based Blind Source Separation
- in Proc. of International Conference on Independent Component Analysis and Signal Separation (ICA2001)
, 2001
Cited by 47 (11 self)
We present a real-time version of the DUET algorithm for the blind separation of any number of sources using only two mixtures. The method applies when sources are W-disjoint orthogonal, that is, when the supports of the windowed Fourier transform of any two signals in the mixture are disjoint sets, an assumption which is justified in the Appendix. The online algorithm is a Maximum Likelihood (ML) based gradient search method that is used to track the mixing parameters. The estimates of the mixing parameters are then used to partition the time-frequency representation of the mixtures to recover the original sources. The technique is valid even in the case when the number of sources is larger than the number of mixtures.
Model-based expectation-maximization source separation and localization
- IEEE Trans. Audio, Speech, and Language Process. (ASLP)
, 2010
Underdetermined blind source separation based on sparse representation
- IEEE Transactions on Signal Processing
, 2006
Cited by 41 (11 self)
This paper discusses underdetermined (i.e., with more sources than sensors) blind source separation (BSS) using a two-stage sparse representation approach. The first challenging task of this approach is to estimate precisely the unknown mixing matrix. In this paper, an algorithm for estimating the mixing matrix that can be viewed as an extension of the DUET and the TIFROM methods is first developed. Standard clustering algorithms (e.g., K-means method) also can be used for estimating the mixing matrix if the sources are sufficiently sparse. Compared with the DUET and TIFROM methods and standard clustering algorithms, the authors' proposed method can solve a broader class of problems, because the required key condition on sparsity of the sources can be considerably relaxed. The second task of the two-stage approach is to estimate the source matrix using a standard linear programming algorithm. Another main contribution of the work described in this paper is the development of a recoverability analysis. After extending the results in [7], a necessary and sufficient condition for recoverability of a source vector is obtained. Based on this condition and various types of source sparsity, several probability inequalities and probability estimates for the recoverability issue are established. Finally, simulation results that illustrate the effectiveness of the theoretical results are presented. Index Terms: Blind source separation (BSS), l1-norm, probability, recoverability, sparse representation, wavelet packets.
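The second stage described in this abstract, recovering a sparse source vector from fewer mixtures than sources, can be sketched for a tiny case. The code below does not call a linear programming solver. It relies on the fact that a minimum l1-norm solution of A s = x can be found among basic solutions with at most as many non-zero entries as there are mixtures, and simply enumerates column pairs for two mixtures. The function name and the toy mixing matrix are assumptions, and the mixing matrix is taken as already known.

```python
import math
from itertools import combinations

def min_l1_sources(A, x):
    """Minimum l1-norm s with A s = x, for a 2 x n mixing matrix A.

    Enumerates every pair of columns, solves the 2 x 2 system, and keeps the
    candidate with the smallest l1 norm. Returns None if all pairs are parallel.
    """
    n = len(A[0])
    best, best_cost = None, math.inf
    for i, j in combinations(range(n), 2):
        det = A[0][i] * A[1][j] - A[0][j] * A[1][i]
        if abs(det) < 1e-12:
            continue  # parallel columns give no unique 2 x 2 solution
        # Cramer's rule for [a_i a_j] [s_i s_j]^T = x.
        si = (x[0] * A[1][j] - A[0][j] * x[1]) / det
        sj = (A[0][i] * x[1] - x[0] * A[1][i]) / det
        cost = abs(si) + abs(sj)
        if cost < best_cost:
            s = [0.0] * n
            s[i], s[j] = si, sj
            best, best_cost = s, cost
    return best

# Three sources, two mixtures: unit-norm columns at 0, 60 and 120 degrees.
angles = [0.0, math.pi / 3, 2 * math.pi / 3]
A = [[math.cos(t) for t in angles], [math.sin(t) for t in angles]]
s_true = [0.0, 2.0, 0.0]  # only the second source is active at this instant
x = [sum(A[r][c] * s_true[c] for c in range(3)) for r in range(2)]
s_hat = min_l1_sources(A, x)
```

Enumeration grows combinatorially with the number of sources, so for realistic sizes a linear programming solver, as the abstract states, is the practical choice.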
Proposals for performance measurement in source separation
- in Proc. 4th Int. Symp. on Independent Component Anal. and Blind Signal Separation (ICA2003)
, 2003
Cited by 41 (16 self)
In this paper, we address a few issues related to the evaluation of the performance of source separation algorithms. We propose several measures of distortion that take into account the gain indeterminacies of BSS algorithms. The total distortion includes interference from the other sources as well as noise and algorithmic artifacts, and we define performance criteria that separately measure these contributions. The criteria are valid even in the case of correlated sources. When the sources are estimated from a degenerate set of mixtures by applying a demixing matrix, we prove that there are upper bounds on the achievable Source to Interference Ratio. We propose these bounds as benchmarks to assess how well a (linear or nonlinear) BSS algorithm performs on a set of degenerate mixtures. We demonstrate on an example how to use these figures of merit to evaluate and compare the performance of BSS algorithms.
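A gain-invariant Source to Interference Ratio of the kind discussed in this abstract can be sketched under a strong simplification. The code below assumes mutually orthogonal true sources and an estimate with no noise or artifact component, so it is narrower than the criteria of the paper, which also cover correlated sources. The function names and the toy signals are hypothetical.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def sir_db(estimate, sources, target):
    """Source to Interference Ratio in dB of `estimate` for sources[target].

    Assumes the true sources are mutually orthogonal, so the estimate can be
    split into a target part and an interference part by projecting onto each
    source separately.
    """
    # Projection coefficient of the estimate on every true source.
    coeffs = [dot(estimate, s) / dot(s, s) for s in sources]
    target_energy = coeffs[target] ** 2 * dot(sources[target], sources[target])
    interference_energy = sum(
        c ** 2 * dot(s, s)
        for k, (c, s) in enumerate(zip(coeffs, sources))
        if k != target
    )
    if interference_energy == 0:
        return math.inf  # no leakage from the other sources
    return 10 * math.log10(target_energy / interference_energy)

s1 = [1.0, 0.0, 1.0, 0.0]
s2 = [0.0, 1.0, 0.0, 1.0]
estimate = [2.0, 0.2, 2.0, 0.2]  # 2 * s1 plus a little leakage from s2
sir = sir_db(estimate, [s1, s2], target=0)
sir_scaled = sir_db([3 * v for v in estimate], [s1, s2], target=0)
```

Scaling the estimate leaves the ratio unchanged, which is the sense in which the measure accounts for the gain indeterminacy of BSS algorithms.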
A SURVEY OF CONVOLUTIVE BLIND SOURCE SEPARATION METHODS
- SPRINGER HANDBOOK ON SPEECH PROCESSING AND SPEECH COMMUNICATION
Cited by 39 (0 self)
In this chapter, we provide an overview of existing algorithms for blind source separation of convolutive audio mixtures. We provide a taxonomy, wherein many of the existing algorithms can be organized, and we present published results from those algorithms that have been applied to real-world audio separation tasks.