Highly sparse representations from dictionaries are unique and independent of the sparseness measure
, 2003
Morphological diversity and source separation
- IEEE Signal Process. Lett
, 2006
Cited by 15 (10 self)
This letter describes a new method for blind source separation, adapted to the case of sources having different morphologies. We show that such morphological diversity leads to a new and very efficient separation method, even in the presence of noise. The algorithm, coined multichannel morphological component analysis (MMCA), is an extension of the morphological component analysis (MCA) method. The latter takes advantage of the sparse representation of structured data in large overcomplete dictionaries to separate features in the data based on their morphology. MCA has been shown to be an efficient technique in such problems as separating an image into texture and piecewise smooth parts or for inpainting applications. The proposed extension, MMCA, extends the above for multichannel data, achieving a better source separation in those circumstances. Furthermore, the new algorithm can efficiently achieve good separation in a noisy context where standard independent component analysis methods fail. The efficiency of the proposed scheme is confirmed in numerical experiments. Index Terms: Blind source separation, morphological component analysis (MCA), sparse representations.
Sparse Audio Representations Using the MCLT
, 2005
Cited by 4 (1 self)
We consider sparse representations of audio based around the Modulated Complex Lapped Transform (MCLT) and a generalized iteratively reweighted least squares algorithm which can be interpreted as a variation of Expectation Maximization. We explore the use of such a representation for both audio coding, comparing this representation to the more traditional Modified Discrete Cosine Transform (MDCT), and for more general signal processing by illustrating the potential of a dual-resolution MCLT representation for audio modification.
Dictionary Learning for Sparse Approximations with the Majorization Method
Cited by 4 (1 self)
In order to find sparse approximations of signals, an appropriate generative model for the signal class has to be known. If the model is unknown, it can be adapted using a set of training samples. This paper presents a novel method for dictionary learning and extends the learning problem by introducing different constraints on the dictionary. The convergence of the proposed method to a fixed point is guaranteed, unless the accumulation points form a continuum. This holds for different sparsity measures. The majorization method is an optimization method that substitutes the original objective function with a surrogate function that is updated in each optimization step. This method has been used successfully in sparse approximation and statistical estimation (e.g. Expectation Maximization (EM)) problems. This paper shows that the majorization method can be used for the dictionary learning problem too. The proposed method is compared with other methods on both synthetic and real data and different constraints on the dictionary are compared. Simulations show the advantages of the proposed method over other currently available dictionary learning methods not only in terms of average performance but also in terms of computation time.
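The surrogate-function idea mentioned in this abstract can be illustrated on the simpler sparse approximation step, with the dictionary held fixed. The sketch below is not the paper's dictionary learning method. It assumes an l1 sparsity measure and the objective 0.5*||y - Dx||^2 + lam*||x||_1, for which minimizing a quadratic surrogate at each step gives the well-known iterative soft-thresholding update. The names `ista`, `soft_threshold`, `lam` and `c` are illustrative, and `c` is assumed to be at least the largest eigenvalue of D^T D so that the surrogate majorizes the objective.

```python
def soft_threshold(v, t):
    """Shrink v toward zero by t (minimizer of the scalar l1 surrogate)."""
    if v > t:
        return v - t
    if v < -t:
        return v + t
    return 0.0


def ista(y, D, lam, c, n_iter):
    """Iterative soft thresholding for 0.5*||y - Dx||^2 + lam*||x||_1.

    D is a list of rows (len(y) rows, one column per atom).
    c is assumed to upper-bound the largest eigenvalue of D^T D.
    """
    n_atoms = len(D[0])
    x = [0.0] * n_atoms
    for _ in range(n_iter):
        # Residual of the current approximation: r = y - D x.
        r = [yi - sum(D[i][j] * x[j] for j in range(n_atoms))
             for i, yi in enumerate(y)]
        # Scaled gradient step: g = D^T r / c.
        g = [sum(D[i][j] * r[i] for i in range(len(y))) / c
             for j in range(n_atoms)]
        # Minimizing the surrogate decouples into scalar shrinkage.
        x = [soft_threshold(x[j] + g[j], lam / c) for j in range(n_atoms)]
    return x


# Toy example: orthonormal (identity) dictionary, so c = 1 is valid.
identity = [[1.0, 0.0], [0.0, 1.0]]
x_hat = ista([3.0, 0.5], identity, lam=1.0, c=1.0, n_iter=5)
```

With an orthonormal dictionary the iteration reaches its fixed point after one step: each coefficient of y is simply shrunk by lam, and the small coefficient is set to zero.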
Sparse Linear Regression With Structured Priors and Application to Denoising of Musical Audio
Cited by 3 (0 self)
We describe in this paper an audio denoising technique based on sparse linear regression with structured priors. The noisy signal is decomposed as a linear combination of atoms belonging to two modified discrete cosine transform (MDCT) bases, plus a residual part containing the noise. One MDCT basis has a long time resolution, and thus high frequency resolution, and is aimed at modeling tonal parts of the signal, while the other MDCT basis has short time resolution and is aimed at modeling transient parts (such as attacks of notes). The problem is formulated within a Bayesian setting. Conditional upon an indicator variable which is either 0 or 1, one expansion coefficient is set to zero or given a hierarchical prior. Structured priors are employed for the indicator variables; using two types of Markov chains, persistency along the time axis is favored for expansion coefficients of the tonal layer, while persistency along the frequency axis is favored for the expansion coefficients of the transient layer. Inference about the denoised signal and model parameters is performed using a Gibbs sampler, a standard Markov chain Monte Carlo (MCMC) sampling technique. We present results for denoising of a short glockenspiel excerpt and a long polyphonic music excerpt. Our approach is compared with unstructured sparse regression and with structured sparse regression in a single resolution MDCT basis (no transient layer). The results show that better denoising is obtained, both from signal-to-noise ratio measurements and from subjective criteria, when both a transient and tonal layer are used, in conjunction with our proposed structured prior framework. Index Terms: Bayesian variable selection, denoising, Markov chain Monte Carlo (MCMC) methods, nonlinear signal approximation, sparse component analysis, sparse regression, sparse representations.
Sparse Representations in Audio and Music: from Coding to Source Separation
Cited by 2 (2 self)
Sparse representations have proved a powerful tool in the analysis and processing of audio signals and already lie at the heart of popular coding standards such as MP3 and Dolby AAC. In this paper we give an overview of a number of current and emerging applications of sparse representations in areas from audio coding, audio enhancement and music transcription to blind source separation solutions that can solve the “cocktail party problem”. In each case we will show how the prior assumption that the audio signals are approximately sparse in some time-frequency representation allows us to address the associated signal processing task.
Fast Sparse Subband Decomposition Using FIRSP
, 2004
Cited by 1 (1 self)
This paper presents a new fast algorithm for generating sparse signal approximations within an overcomplete subband representation. While the current paper concentrates on sparsifying the Modulated Complex Lapped Transform, the theory is applicable to representations composed of a general union of orthonormal bases. We illustrate our method on an audio signal and demonstrate the coding gain of such representations.
Sound Detection And Classification Through Transient Models
, 2004
Cited by 1 (0 self)
Medical telesurvey requires that human operators be assisted by smart information systems. Conventional sound classification can be applied to medical monitoring by placing microphones in the patient's home. Detection is the first step of our sound analysis system and is necessary to extract the significant sounds before initiating the classification step. This paper proposes a detection method using transient models, based on dyadic trees of wavelet coefficients, to ensure a short detection delay. The classification stage uses a Gaussian Mixture Model classifier with classical acoustical parameters such as MFCC. The detection and classification stages are evaluated under experimentally recorded noise, which is nonstationary, more aggressive than simulated white noise, and representative of our application. Wavelet filtering methods are proposed to improve performance at low signal-to-noise ratios.
Sparse Approximation with Block Incoherent Dictionaries
, 2003
Cited by 1 (0 self)
In this paper, we relax some of these strong hypotheses by allowing more redundancy in the dictionaries, through the concept of block incoherence, which describes a dictionary that can be represented as the union of incoherent blocks. We show that even pure greedy algorithms can strongly benefit from such a design by proving a recovery condition under which Matching Pursuit will always pick up correct atoms during the signal expansion. Based on this result, we design an algorithm that constructs a near block incoherent dictionary starting from any initial dictionary. A tree structured greedy algorithm is then proposed as a way of constructing sparse approximations with block incoherent dictionaries. This algorithm has the important advantage of being much faster than a classical Matching Pursuit. At the same time, it only minimally degrades the quality of approximation thanks to the recovery condition derived for block incoherent dictionaries. The performance of the proposed algorithm is demonstrated in the context of image representation.
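For readers unfamiliar with the classical Matching Pursuit that this abstract uses as its baseline, here is a minimal sketch of the plain greedy algorithm. It is not the tree-structured algorithm proposed in the paper. It assumes unit-norm atoms stored as plain Python lists, and the function name `matching_pursuit` and the toy dictionary are illustrative only.

```python
def matching_pursuit(signal, atoms, n_iter):
    """Plain Matching Pursuit over unit-norm atoms (lists of floats)."""
    residual = list(signal)
    coeffs = [0.0] * len(atoms)
    for _ in range(n_iter):
        # Correlate the current residual with every atom.
        corrs = [sum(r * a for r, a in zip(residual, atom)) for atom in atoms]
        # Greedy choice: the atom with the largest absolute correlation.
        k = max(range(len(atoms)), key=lambda i: abs(corrs[i]))
        coeffs[k] += corrs[k]
        # Remove the selected atom's contribution from the residual.
        residual = [r - corrs[k] * a for r, a in zip(residual, atoms[k])]
    return coeffs, residual


# Toy dictionary: the two standard basis vectors of R^2 plus a diagonal atom.
s = 2 ** -0.5
atoms = [[1.0, 0.0], [0.0, 1.0], [s, s]]
coeffs, residual = matching_pursuit([3.0, 0.0], atoms, n_iter=1)
```

In this toy case the signal is a multiple of the first atom, so a single iteration selects that atom and leaves a zero residual. Each iteration scans the whole dictionary, which is the cost that faster structured variants aim to reduce.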
SOUND CLASSIFICATION IN A SMART ROOM ENVIRONMENT: AN APPROACH USING GMM AND HMM METHODS
Cited by 1 (1 self)
For reasons of cost or convenience, patients or elderly people may be hospitalized at home, and smart information systems are then needed to assist human operators. In this case, position and physiological sensors already provide a great deal of information, but there are few studies on the use of sound in the patient's home. However, sound classification and speech recognition may greatly increase the versatility of such a system by detecting short sentences or words that could characterize a distress situation for the patient. Analysis and classification of sounds emitted in the patient's home may also be useful for monitoring the patient's activity. GMMs and HMMs are well suited for sound classification. Until now, GMMs have frequently been used for sound classification in smart rooms because of their low computational cost, but HMMs should allow a finer analysis: the use of 3-state HMMs should give better performance by taking into account the variation of the signal over time. For this work, a new sound corpus was recorded in experimental conditions. This corpus includes 8 sound classes useful for our application. The choice of acoustical features and the two approaches are presented. An evaluation is then made with the initial corpus and with additional experimental noise, and the results obtained are compared. Finally, a segmentation module is presented. This module can extract isolated sounds from a recording by means of a wavelet filtering method, which allows extraction in noisy conditions.

