Results 1-10 of 63
A tutorial on onset detection in music signals
 IEEE Transactions on Speech and Audio Processing
, 2005
Cited by 182 (15 self)
Note onset detection and localization is useful in a number of analysis and indexing techniques for musical signals. The usual way to detect onsets is to look for “transient” regions in the signal, a notion that leads to many definitions: a sudden burst of energy, a change in the short-time spectrum of the signal or in the statistical properties, etc. The goal of this paper is to review, categorize, and compare some of the most commonly used techniques for onset detection, and to present possible enhancements. We discuss methods based on the use of explicitly predefined signal features: the signal’s amplitude envelope, spectral magnitudes and phases, time-frequency representations; and methods based on probabilistic signal models: model-based change point detection, surprise signals, etc. Using a choice of test cases, we provide some guidelines for choosing the appropriate method for a given application.
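As a rough illustration of the "sudden burst of energy" definition mentioned in the abstract above, the following sketch computes a short-time energy curve and reports frames where the energy rise peaks. It is not taken from the paper: the function names, the frame and hop sizes, the relative threshold, and the toy signal are all assumptions chosen for the example.

```python
import math


def frame_energies(signal, frame_len, hop):
    """Short-time energy (sum of squared samples) of each full frame."""
    return [
        sum(x * x for x in signal[start:start + frame_len])
        for start in range(0, len(signal) - frame_len + 1, hop)
    ]


def detect_onsets(signal, frame_len=64, hop=32, threshold=0.5):
    """Return start indices of frames where the energy rise has a local peak.

    `threshold` is relative to the largest rise (a hypothetical choice).
    """
    energies = frame_energies(signal, frame_len, hop)
    # Half-wave rectified first difference: keep only increases in energy.
    rises = [max(0.0, b - a) for a, b in zip(energies, energies[1:])]
    if not rises or max(rises) == 0.0:
        return []
    limit = threshold * max(rises)
    onsets = []
    for i, rise in enumerate(rises):
        left = rises[i - 1] if i > 0 else 0.0
        right = rises[i + 1] if i + 1 < len(rises) else 0.0
        # Keep local maxima of the rise curve that exceed the limit.
        if rise >= limit and rise >= left and rise > right:
            onsets.append((i + 1) * hop)  # start sample of the louder frame
    return onsets


# Toy signal (assumed): silence, then a decaying 440 Hz tone from sample 512.
sample_rate = 8000
silence = [0.0] * 512
tone = [
    math.exp(-n / 400.0) * math.sin(2 * math.pi * 440 * n / sample_rate)
    for n in range(1024)
]
signal = silence + tone

onsets = detect_onsets(signal)
print(onsets)  # a single detection within one hop of sample 512
```

An energy-based detector like this is only one of the feature families the abstract lists; spectral or phase-based detection functions would replace `frame_energies` with a different per-frame feature.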
Highly sparse representations from dictionaries are unique and independent of the sparseness measure
, 2003
Dictionary Learning for Sparse Approximations with the Majorization Method
Cited by 49 (10 self)
Abstract—In order to find sparse approximations of signals, an appropriate generative model for the signal class has to be known. If the model is unknown, it can be adapted using a set of training samples. This paper presents a novel method for dictionary learning and extends the learning problem by introducing different constraints on the dictionary. The convergence of the proposed method to a fixed point is guaranteed, unless the accumulation points form a continuum. This holds for different sparsity measures. The majorization method is an optimization method that substitutes the original objective function with a surrogate function that is updated in each optimization step. This method has been used successfully in sparse approximation and statistical estimation (e.g. Expectation Maximization (EM)) problems. This paper shows that the majorization method can be used for the dictionary learning problem too. The proposed method is compared with other methods on both synthetic and real data and different constraints on the dictionary are compared. Simulations show the advantages of the proposed method over other currently available dictionary learning methods not only in terms of average performance but also in terms of computation time.
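To make the surrogate-function idea in the abstract above concrete, here is a minimal sketch of majorization applied to sparse approximation with a fixed dictionary (iterative soft thresholding). It is not the paper's dictionary learning method: the ℓ1 penalty, the names `ista` and `soft_threshold`, the toy dictionary `D`, and the value of `lam` are assumptions made for illustration.

```python
def soft_threshold(v, t):
    """Shrink v toward zero by t (minimizer of the separable surrogate)."""
    if v > t:
        return v - t
    if v < -t:
        return v + t
    return 0.0


def residual_of(D, y, x):
    """Compute y - D x for a dictionary stored as a list of rows."""
    return [yi - sum(d * xj for d, xj in zip(row, x)) for row, yi in zip(D, y)]


def objective(D, y, x, lam):
    """0.5 * ||y - D x||^2 + lam * ||x||_1."""
    r = residual_of(D, y, x)
    return 0.5 * sum(ri * ri for ri in r) + lam * sum(abs(xj) for xj in x)


def ista(D, y, lam, n_iter=200):
    """Minimize the objective by repeatedly minimizing a quadratic surrogate."""
    n_rows, n_atoms = len(D), len(D[0])
    # The squared Frobenius norm upper-bounds the largest eigenvalue of D^T D,
    # so the quadratic surrogate built with L majorizes the data-fit term.
    L = sum(d * d for row in D for d in row)
    x = [0.0] * n_atoms
    history = [objective(D, y, x, lam)]
    for _ in range(n_iter):
        r = residual_of(D, y, x)
        # Gradient step on the data-fit term, then shrinkage.
        stepped = [
            x[j] + sum(D[i][j] * r[i] for i in range(n_rows)) / L
            for j in range(n_atoms)
        ]
        x = [soft_threshold(v, lam / L) for v in stepped]
        history.append(objective(D, y, x, lam))
    return x, history


# Toy problem (assumed): two canonical atoms plus one unit-norm mixed atom.
D = [[1.0, 0.0, 0.6],
     [0.0, 1.0, 0.8]]
y = [1.2, 1.6]  # twice the third atom
x_hat, history = ista(D, y, lam=0.1)
print(x_hat)
```

Because each surrogate touches the objective at the current iterate and lies above it elsewhere, the recorded `history` never increases, which is the basic property majorization methods rely on.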
Sparse Representations in Audio and Music: from Coding to Source Separation
Cited by 38 (9 self)
Abstract—Sparse representations have proved a powerful tool in the analysis and processing of audio signals and already lie at the heart of popular coding standards such as MP3 and Dolby AAC. In this paper we give an overview of a number of current and emerging applications of sparse representations in areas from audio coding, audio enhancement and music transcription to blind source separation solutions that can solve the “cocktail party problem”. In each case we will show how the prior assumption that the audio signals are approximately sparse in some time-frequency representation allows us to address the associated signal processing task.
Sparse and structured decompositions of signals with the molecular matching pursuit
 IEEE Transactions on Speech and Audio Processing
Cited by 34 (5 self)
algorithm for the decomposition of signals. The MMP is a practical solution which introduces the notion of structures within the framework of sparse overcomplete representations; these structures are based on the local dependency of significant time-frequency or time-scale atoms. We show that this algorithm is well adapted to the representation of real signals such as percussive audio signals. This is at the cost of a slight suboptimality in terms of the rate of convergence for the approximation error, but the benefits are numerous, most notably a significant reduction in the computational cost, which facilitates the processing of long signals. Results show that this algorithm is very promising for high-quality adaptive coding of audio signals. Index Terms—Matching pursuit, overcomplete representations, parametric audio coding, time-frequency transforms.
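For readers unfamiliar with the index term "matching pursuit", the sketch below shows the plain greedy algorithm that selects one atom per iteration. It is the standard matching pursuit, not the molecular variant (MMP) described in the abstract, and the tiny dictionary and signal are assumptions chosen so the arithmetic is easy to follow.

```python
def dot(a, b):
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def matching_pursuit(signal, atoms, n_iter):
    """Greedy decomposition of `signal` over unit-norm `atoms`.

    Returns the accumulated coefficient per atom and the final residual.
    """
    residual = list(signal)
    coeffs = [0.0] * len(atoms)
    for _ in range(n_iter):
        correlations = [dot(residual, atom) for atom in atoms]
        # Pick the atom most correlated with the current residual.
        best = max(range(len(atoms)), key=lambda k: abs(correlations[k]))
        coeffs[best] += correlations[best]
        # Remove that atom's contribution from the residual.
        residual = [
            r - correlations[best] * a for r, a in zip(residual, atoms[best])
        ]
    return coeffs, residual


# Toy overcomplete dictionary (assumed): canonical basis of R^4 plus a flat atom.
atoms = [
    [1.0, 0.0, 0.0, 0.0],
    [0.0, 1.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 0.0],
    [0.0, 0.0, 0.0, 1.0],
    [0.5, 0.5, 0.5, 0.5],
]
signal = [3.5, 1.5, 1.5, 1.5]  # 2 * atoms[0] + 3 * atoms[4]

coeffs, residual = matching_pursuit(signal, atoms, n_iter=2)
print(coeffs, residual)
```

After two iterations the greedy choice has not recovered the generating coefficients (2 and 3) exactly, which illustrates why pursuit algorithms are generally suboptimal and why structured variants trade convergence rate against computational cost.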
M.: Beyond sparsity: Recovering structured representations by ℓ1 minimization and greedy algorithms
 Advances in Computational Mathematics
, 2008
Cited by 34 (9 self)
Finding a sparse approximation of a signal from an arbitrary dictionary is a very useful tool to solve many problems in signal processing. Several algorithms, such as Basis Pursuit (BP) and Matching Pursuits (MP, also known as greedy algorithms), have been introduced to compute sparse approximations of signals, but such algorithms a priori only provide suboptimal solutions. In general, it is difficult to estimate how close a computed solution is to the optimal one. In a series of recent results, several authors have shown that both BP and MP can successfully recover a sparse representation of a signal provided that it is sparse enough, that is to say if its support (which indicates where the nonzero coefficients are located) is of sufficiently small size. In this paper we define identifiable structures that support signals that can be recovered exactly by ℓ1 minimization (Basis Pursuit) and greedy algorithms. In other words, if the support of a representation belongs to an identifiable structure, then the representation will be recovered by BP and MP. In addition, we obtain that if the output of an arbitrary decomposition algorithm is supported on an identifiable structure, then one can be sure that the representation is optimal within the class of signals supported by the structure. As an application of the theoretical results, we give a detailed study of a family of multichannel dictionaries with a special structure (corresponding to the representation problem X = ASΦ^T) often used in, e.g., underdetermined
Morphological diversity and source separation
 IEEE Signal Process. Lett
, 2006
Cited by 30 (12 self)
Abstract—This letter describes a new method for blind source separation, adapted to the case of sources having different morphologies. We show that such morphological diversity leads to a new and very efficient separation method, even in the presence of noise. The algorithm, coined multichannel morphological component analysis (MMCA), is an extension of the morphological component analysis (MCA) method. The latter takes advantage of the sparse representation of structured data in large overcomplete dictionaries to separate features in the data based on their morphology. MCA has been shown to be an efficient technique in such problems as separating an image into texture and piecewise smooth parts or for inpainting applications. The proposed extension, MMCA, extends the above for multichannel data, achieving a better source separation in those circumstances. Furthermore, the new algorithm can efficiently achieve good separation in a noisy context where standard independent component analysis methods fail. The efficiency of the proposed scheme is confirmed in numerical experiments. Index Terms—Blind source separation, morphological component analysis (MCA), sparse representations.
Time-frequency jigsaw puzzle: adaptive and multilayered Gabor expansions
 In: Int. Jour. for Wavelets and Multiresolution Information Processing 1.5 (2007)
Cited by 23 (3 self)
We describe a new adaptive multiwindow Gabor expansion, which dynamically adapts the windows to the signal’s features in time-frequency space. The adaptation is based upon local time-frequency sparsity criteria, and also yields as a byproduct an expansion of the signal into layers corresponding to different windows. As an illustration, we show that simply using two different windows with different sizes leads to decompositions of audio signals into transient and tonal layers. We also discuss potential applications to transient detection and denoising.
An hybrid audio scheme using hidden Markov models of waveforms
 Applied and Computational Harmonic Analysis
, 2005
Cited by 22 (11 self)
Abstract. This paper reports on recent results related to audiophonic signals encoding using time-scale and time-frequency transform. More precisely, nonlinear, structured approximations for tonal and transient components using local cosine and wavelet bases will be described, yielding expansions of audio signals in the form tonal + transient + residual. We describe a general formulation involving hidden Markov models, together with corresponding rate estimates. Estimators for the balance transient/tonal are also discussed. (hal-00350467, version 1, 6 Jan 2009)
Sparse Linear Regression With Structured Priors and Application to Denoising of Musical Audio
 IEEE Trans. Sp. Audio Proc
Cited by 21 (1 self)
Abstract—We describe in this paper an audio denoising technique based on sparse linear regression with structured priors. The noisy signal is decomposed as a linear combination of atoms belonging to two modified discrete cosine transform (MDCT) bases, plus a residual part containing the noise. One MDCT basis has a long time resolution, and thus high frequency resolution, and is aimed at modeling tonal parts of the signal, while the other MDCT basis has short time resolution and is aimed at modeling transient parts (such as attacks of notes). The problem is formulated within a Bayesian setting. Conditional upon an indicator variable which is either 0 or 1, one expansion coefficient is set to zero or given a hierarchical prior. Structured priors are employed for the indicator variables; using two types of Markov chains, persistency along the time axis is favored for expansion coefficients of the tonal layer, while persistency along the frequency axis is favored for the expansion coefficients of the transient layer. Inference about the denoised signal and model parameters is performed using a Gibbs sampler, a standard Markov chain Monte Carlo (MCMC) sampling technique. We present results for denoising of a short glockenspiel excerpt and a long polyphonic music excerpt. Our approach is compared with unstructured sparse regression and with structured sparse regression in a single resolution MDCT basis (no transient layer). The results show that better denoising is obtained, both from signal-to-noise ratio measurements and from subjective criteria, when both a transient and tonal layer are used, in conjunction with our proposed structured prior framework. Index Terms—Bayesian variable selection, denoising, Markov chain Monte Carlo (MCMC) methods, nonlinear signal approximation, sparse component analysis, sparse regression, sparse representations.