Results 1-6 of 6
Fast search for best representations in multitree dictionaries
 In Wavelet Applications in Signal and Image Processing VIII, Proc. SPIE 4119, 2000. [7] S.G. Mallat. A Wavelet Tour of Signal Processing, Second Edition
, 2006
Cited by 13 (4 self)
Abstract: We address the best basis problem (or, more generally, the best representation problem): given a signal, a dictionary of representations, and an additive cost function, the aim is to select the representation from the dictionary that minimizes the cost for the given signal. We develop a new framework of multitree dictionaries, which includes some previously proposed dictionaries as special cases. We show how to efficiently find the best representation in a multitree dictionary using a recursive tree-pruning algorithm. We illustrate our framework through several examples, including a novel block image coder, which significantly outperforms both the standard JPEG and quadtree-based methods and is comparable to embedded coders such as JPEG2000 and SPIHT. Index Terms: Best basis, grammar, image compression, JPEG.
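The recursive pruning idea can be illustrated on the simplest special case, a single dyadic segmentation tree. The sketch below is not the multitree algorithm of the paper; it is the classical bottom-up best-basis recursion, and the cost function (squared error around the segment mean plus a per-segment penalty) and all function names are hypothetical choices made only for illustration. At each node, the cost of keeping the segment whole is compared with the summed best costs of its two halves, which is valid because the cost is additive.

```python
def leaf_cost(signal, lo, hi, penalty):
    """Hypothetical additive cost: squared error around the mean, plus a per-segment penalty."""
    seg = signal[lo:hi]
    mean = sum(seg) / len(seg)
    return sum((x - mean) ** 2 for x in seg) + penalty


def best_segmentation(signal, lo, hi, penalty):
    """Return (cost, segments) of the cheapest dyadic segmentation of signal[lo:hi]."""
    keep = leaf_cost(signal, lo, hi, penalty)
    if hi - lo < 2:
        # A single sample cannot be split further.
        return keep, [(lo, hi)]
    mid = (lo + hi) // 2
    left_cost, left_segs = best_segmentation(signal, lo, mid, penalty)
    right_cost, right_segs = best_segmentation(signal, mid, hi, penalty)
    split = left_cost + right_cost
    if keep <= split:
        # Prune the subtree: one segment is at least as cheap as any split.
        return keep, [(lo, hi)]
    return split, left_segs + right_segs


signal = [0.0, 0.0, 0.0, 0.0, 5.0, 5.0, 1.0, 9.0]
cost, segments = best_segmentation(signal, 0, len(signal), penalty=1.0)
print(cost, segments)
```

With this toy signal, the flat first half is kept as one segment while the second half is split further. Each node is visited once, so the search is linear in the number of tree nodes even though the number of candidate segmentations grows exponentially with depth.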
Extension of Sparse, Adaptive Signal Decompositions to Semi-Blind Audio Source Separation
Cited by 4 (3 self)
Abstract. We apply sparse, fast and flexible adaptive lapped orthogonal transforms to underdetermined audio source separation using the time-frequency masking framework. This normally requires the sources to overlap as little as possible in the time-frequency plane. In this work, we apply our adaptive transform schemes to the semi-blind case, in which the mixing system is already known, but the sources are unknown. By assuming that exactly two sources are active at each time-frequency index, we determine both the adaptive transforms and the estimated source coefficients using ℓ1-norm minimisation. We show average performance of 12–13 dB SDR on speech and music mixtures, and show that the adaptive transform scheme offers improvements in the order of several tenths of a dB over transforms with constant block length. Comparison with previously studied upper bounds suggests that the potential for future improvements is significant.
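A minimal sketch of the coefficient-estimation step described above, assuming two mixture channels, a known real-valued mixing matrix, and exactly two active sources per time-frequency index. For every pair of sources, the 2x2 system is solved exactly and the pair with the smallest ℓ1 norm is kept. The function name `separate_coefficient` and the example matrix are hypothetical, and the adaptive transform selection from the abstract is not shown.

```python
from itertools import combinations


def separate_coefficient(x, A):
    """Estimate source coefficients at one time-frequency index.

    x: pair (x1, x2) of mixture coefficients.
    A: known mixing matrix as 2 rows of N gains (N sources, 2 mixtures).
    """
    n_sources = len(A[0])
    best = None
    for j, k in combinations(range(n_sources), 2):
        a, b = A[0][j], A[0][k]
        c, d = A[1][j], A[1][k]
        det = a * d - b * c
        if abs(det) < 1e-12:
            continue  # the two mixing directions are parallel; skip this pair
        # Cramer's rule for [[a, b], [c, d]] @ [sj, sk] = x
        sj = (x[0] * d - b * x[1]) / det
        sk = (a * x[1] - c * x[0]) / det
        l1 = abs(sj) + abs(sk)
        if best is None or l1 < best[0]:
            best = (l1, j, k, sj, sk)
    estimate = [0.0] * n_sources
    if best is not None:
        _, j, k, sj, sk = best
        estimate[j], estimate[k] = sj, sk
    return estimate


# Three sources, two mixtures; columns are the (unnormalised) mixing directions.
A = [[1.0, 0.0, 1.0],
     [0.0, 1.0, 1.0]]
x = (5.0, 3.0)  # produced by sources 0 and 2 with coefficients 2 and 3
estimate = separate_coefficient(x, A)
print(estimate)
```

In practice the columns of the mixing matrix would usually be normalised so that the ℓ1 comparison between pairs is not biased by differing gains; that step is omitted here for brevity.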
BENCHMARKING FLEXIBLE ADAPTIVE TIME-FREQUENCY TRANSFORMS FOR UNDERDETERMINED AUDIO SOURCE SEPARATION
Cited by 4 (2 self)
We have implemented several fast and flexible adaptive lapped orthogonal transform (LOT) schemes for underdetermined audio source separation. This is generally addressed by time-frequency masking, requiring the sources to be disjoint in the time-frequency domain. We have already shown that disjointness can be increased via adaptive dyadic LOTs. By taking inspiration from the windowing schemes used in many audio coding frameworks, we improve on earlier results in two ways. Firstly, we consider non-dyadic LOTs, which match the time-varying signal structures better. Secondly, we allow for a greater range of overlapping window profiles to decrease window boundary artifacts. This new scheme is benchmarked through oracle evaluations, and is shown to decrease computation time by over an order of magnitude compared to using very general schemes, whilst maintaining high separation performance and flexible signal adaptivity. As the results demonstrate, this work may find practical applications in high-fidelity audio source separation. Index Terms: Time-frequency analysis, Discrete cosine transforms, Source separation, Benchmark, Evaluation
Audio Source Separation using Sparse Representations
Cited by 1 (0 self)
We address the problem of audio source separation, namely, the recovery of audio signals from recordings of mixtures of those signals. The sparse component analysis framework is a powerful method for achieving this. Sparse orthogonal transforms, in which only a few transform coefficients differ significantly from zero, are developed; once the signal has been transformed, energy is apportioned from each transform coefficient to each estimated source, and, finally, the signal is reconstructed using the inverse transform. The overriding aim of this chapter is to demonstrate how this framework, as exemplified here by two different decomposition methods that adapt to the signal to represent it sparsely, can be used to solve different problems in different mixing scenarios. To address the instantaneous (neither delays nor echoes) and underdetermined (more sources than mixtures) mixing model, a lapped orthogonal transform is adapted to the signal by selecting a basis from a library of predetermined bases. This method is closely related to the windowing methods used in the MPEG audio coding framework. For the anechoic (delays but no echoes) and determined (equal number of sources and mixtures) mixing case, a greedy adaptive transform is used, based on orthogonal basis functions that are learned from the observed data instead of being selected from a predetermined library of bases. This is found to encode the signal characteristics by introducing a feedback system between the bases and the observed data. Experiments on mixtures of speech and music signals demonstrate that these methods give good signal approximations and separation performance, and indicate promising directions for future research.
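The three-stage pipeline described above (transform, apportion each coefficient to a source, inverse transform) can be sketched as follows. This is only an illustration under stated assumptions: a one-level orthonormal Haar transform stands in for the adaptive lapped orthogonal transform of the chapter, the apportioning is a simple binary mask that gives each coefficient to the best-aligned known mixing direction, and all function names and test signals are hypothetical.

```python
import math

R = math.sqrt(0.5)


def haar_forward(x):
    """One-level orthonormal Haar transform of an even-length list."""
    approx = [(x[i] + x[i + 1]) * R for i in range(0, len(x), 2)]
    detail = [(x[i] - x[i + 1]) * R for i in range(0, len(x), 2)]
    return approx + detail


def haar_inverse(c):
    """Inverse of haar_forward."""
    half = len(c) // 2
    x = []
    for a, d in zip(c[:half], c[half:]):
        x.extend([(a + d) * R, (a - d) * R])
    return x


def separate(mix_left, mix_right, directions):
    """Binary-mask separation of a two-channel instantaneous mixture.

    directions: one unit vector (left_gain, right_gain) per source, assumed known.
    """
    cl, cr = haar_forward(mix_left), haar_forward(mix_right)
    coeffs = [[0.0] * len(cl) for _ in directions]
    for i in range(len(cl)):
        # Projection of the mixture coefficient pair onto each mixing direction.
        proj = [cl[i] * gl + cr[i] * gr for gl, gr in directions]
        winner = max(range(len(proj)), key=lambda s: abs(proj[s]))
        coeffs[winner][i] = proj[winner]  # the whole coefficient goes to one source
    return [haar_inverse(c) for c in coeffs]


# Three sources that do not overlap in the Haar domain, mixed into two channels.
s0, s1, s2 = [1.0, 1.0, 0.0, 0.0], [0.0, 0.0, 2.0, 2.0], [1.0, -1.0, 0.0, 0.0]
directions = [(1.0, 0.0), (0.0, 1.0), (R, R)]
left = [sum(d[0] * s[n] for d, s in zip(directions, (s0, s1, s2))) for n in range(4)]
right = [sum(d[1] * s[n] for d, s in zip(directions, (s0, s1, s2))) for n in range(4)]
estimates = separate(left, right, directions)
print(estimates)
```

Because the toy sources occupy different transform coefficients, the binary mask recovers them up to rounding error; real audio sources overlap, which is why a sparser, signal-adapted transform improves separation.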
Decompositions to Semi-Blind Audio Source Separation
, 2010
Fast Search for Best Representations in Multitree Dictionaries. IEEE Transactions on Image Processing, to appear.
Abstract: We address the best basis problem (or, more generally, the best representation problem): given a signal, a dictionary of representations, and an additive cost function, the aim is to select the representation from the dictionary that minimizes the cost for the given signal. We develop a new framework of multitree dictionaries, which includes some previously proposed dictionaries as special cases. We show how to efficiently find the best representation in a multitree dictionary using a recursive tree-pruning algorithm. We illustrate our framework through several examples, including a novel block image coder, which significantly outperforms both the standard JPEG and quadtree-based methods and is comparable to embedded coders such as JPEG2000 and SPIHT. I. Introduction. A number of research efforts have recently concentrated ...