Endogenous convolutional sparse representations for translation invariant image subspace models
Proc. IEEE International Conference on Image Processing (ICIP), 2014
Cited by 2 (1 self)
Subspace models for image data sets, constructed by computing sparse representations of each image with respect to other images in the set, have been found to perform very well in a variety of applications, including clustering and classification problems. One of the limitations of these methods, however, is that the subspace representation is unable to directly model the effects of nonlinear transformations such as translation, rotation, and dilation that frequently occur in practice. In this paper it is shown that the properties of convolutional sparse representations can be exploited to make these methods translation invariant, thereby simplifying or eliminating the alignment preprocessing task. The potential of the proposed approach is demonstrated in two diverse applications: image clustering and video background modeling.
INFORMED MONAURAL SOURCE SEPARATION OF MUSIC BASED ON CONVOLUTIONAL SPARSE CODING
Cited by 1 (1 self)
Monaural source separation is a challenging problem that has many important applications in music information retrieval. In this paper, we focus on the score-informed variant of this problem. While nonnegative matrix factorization and some other approaches have been shown to be effective, few existing approaches have properly taken the phase information into account. The separation result contains unnatural sounds, as the phase of each source signal is considered equivalent to the phase of the mixed signal. To remedy this, we propose to perform source separation directly in the time domain using a convolutional sparse coding (CSC) approach. Evaluation on the Bach10 dataset shows that, when the instrument, pitch, and onset/offset times are provided, the source-to-distortion ratio of the separation result reaches 8.59 dB, which is 2.02 dB higher than a state-of-the-art system called Soundprism. Index Terms: convolutional sparse coding, dictionary learning, score-informed monaural source separation
Convolutional Sparse Coding for Image Super-resolution
Most of the previous sparse coding (SC) based super resolution (SR) methods partition the image into overlapped patches, and process each patch separately. These methods, however, ignore the consistency of pixels in overlapped patches, which is a strong constraint for image reconstruction. In this paper, we propose a convolutional sparse coding (CSC) based SR (CSC-SR) method to address the consistency issue. Our CSC-SR involves three groups of parameters to be learned: (i) a set of filters to decompose the low resolution (LR) image into LR sparse feature maps; (ii) a mapping function to predict the high resolution (HR) feature maps from the LR ones; and (iii) a set of filters to reconstruct the HR images from the predicted HR feature maps via simple convolution operations. By working directly on the whole image, the proposed CSC-SR algorithm does not need to divide the image into overlapped patches, and can exploit the image global correlation to produce more robust reconstruction of image local structures. Experimental results clearly validate the advantages of CSC over patch based SC in SR application. Compared with state-of-the-art SR methods, the proposed CSC-SR method achieves highly competitive PSNR results, while demonstrating better edge and texture preservation performance.
TRANSLATIONAL AND ROTATIONAL JITTER INVARIANT INCREMENTAL PRINCIPAL COMPONENT PURSUIT FOR VIDEO BACKGROUND MODELING
While Principal Component Pursuit (PCP) is currently considered to be the state-of-the-art method for video background modeling, it suffers from a number of limitations, including a high computational cost, a batch operating mode, and sensitivity to camera jitter. In this paper we propose a novel fully incremental PCP algorithm for video background modeling that is robust to translational and rotational jitter. It processes one frame at a time, obtaining similar results to standard batch PCP algorithms, while being able to deal with translational and rotational jitter. It also has an extremely low memory footprint, and a computational complexity that allows almost real-time processing.
Efficient Algorithms for Convolutional Sparse Representations
When applying sparse representation techniques to images, the standard approach is to independently compute the representations for a set of overlapping image patches. This method performs very well in a variety of applications, but results in a representation that is multi-valued and not optimised with respect to the entire image. An alternative representation structure is provided by a convolutional sparse representation, in which a sparse representation of an entire image is computed by replacing the linear combination of a set of dictionary vectors by the sum of a set of convolutions with dictionary filters. The resulting representation is both single-valued and jointly optimised over the entire image. While this form of sparse representation has been applied to a variety of problems in signal and image processing and computer vision, the computational expense of the corresponding optimisation problems has restricted application to relatively small signals and images. This paper presents new, efficient algorithms that substantially improve on the performance of other recent methods, contributing to the development of this type of representation as a practical tool for a wider range of problems.
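The synthesis model described in this abstract (a signal written as a sum of convolutions between dictionary filters and sparse coefficient maps) can be illustrated with a minimal sketch. It is not the paper's algorithm: it only evaluates the model for given coefficients and does not solve the sparse coding problem. The 1-D setting, the function names `conv_full` and `synthesize`, and the toy filters are all assumptions made for illustration.

```python
def conv_full(d, x):
    """Full linear convolution of a filter d with a coefficient map x."""
    out = [0.0] * (len(d) + len(x) - 1)
    for i, xi in enumerate(x):
        if xi == 0.0:
            continue  # sparse map: zero coefficients contribute nothing
        for j, dj in enumerate(d):
            out[i + j] += xi * dj
    return out


def synthesize(filters, maps):
    """Evaluate s = sum_m d_m * x_m (filters share one length, maps share one length)."""
    total = None
    for d, x in zip(filters, maps):
        term = conv_full(d, x)
        total = term if total is None else [a + b for a, b in zip(total, term)]
    return total


# Toy dictionary of two filters and two sparse coefficient maps (illustrative values).
filters = [[1.0, 2.0, 1.0], [1.0, -1.0, 0.0]]
maps = [[0.0, 1.0, 0.0, 0.0, 0.0], [0.0, 0.0, 0.0, 2.0, 0.0]]
signal = synthesize(filters, maps)

# Shifting every coefficient map by one sample shifts the synthesized signal
# by one sample, while the filters stay unchanged.
shifted_maps = [[0.0] + m[:-1] for m in maps]
shifted_signal = synthesize(filters, shifted_maps)
```

Because one set of coefficient maps covers the whole signal, each sample has a single representation, unlike overlapping patches that are coded independently.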
PIANO MUSIC TRANSCRIPTION WITH FAST CONVOLUTIONAL SPARSE CODING
Automatic music transcription (AMT) is the process of converting an acoustic musical signal into a symbolic musical representation, such as a MIDI file, which contains the pitches, the onsets and offsets of the notes and, possibly, their dynamics and sources (i.e., instruments). Most existing algorithms for AMT operate in the frequency domain, which introduces the well-known time/frequency resolution trade-off of the Short Time Fourier Transform and its variants. In this paper, we propose a time-domain transcription algorithm based on an efficient convolutional sparse coding algorithm in an instrument-specific scenario, i.e., the dictionary is trained and tested on the same piano. The proposed method outperforms a current state-of-the-art AMT method by over 26% in F-measure, achieving a median F-measure of 93.6%, and drastically increases both time and frequency resolutions, especially for the lowest octaves of the piano keyboard.
Anomaly Detection Using Convolutional Sparse Models
We address the problem of detecting anomalous regions in images, i.e. regions having a structure that does not conform to normal images in a reference set [1]. Our approach is based on convolutional sparse models [2], which model an image $s \in \mathbb{R}^{n_1 \times n_2}$ as the sum of $k$ convolutions between filters $d_m \in \mathbb{R}^{h_1 \times h_2}$ and sparse feature maps $x_m \in \mathbb{R}^{n_1 \times n_2}$, $m \in \{1, \ldots, k\}$, i.e. $s \approx \sum_{m=1}^{k} d_m \ast x_m$. (1) Feature maps $\{x_m\}$ of an input image $s$ are computed by a sparse coding algorithm solving the optimization problem [2] arg min {xm} 1
Symmetrized Regression for Hyperspectral Background Estimation
We can improve the detection of targets and anomalies in a cluttered background by more effectively estimating that background. With a good estimate of what the target-free radiance or reflectance ought to be at a pixel, we have a point of comparison with what the measured value of that pixel actually happens to be. It is common to make this estimate using the mean of pixels in an annulus around the pixel of interest. But there is more information in the annulus than this mean value, and one can derive more general estimators than just the mean. The derivation pursued here is based on multivariate regression of the central pixel against the pixels in the surrounding annulus. This can be done on a band-by-band basis, or with multiple bands simultaneously. For overhead remote sensing imagery with square pixels, there is a natural eightfold symmetry in the surrounding annulus, corresponding to reflection and right angle rotation. We can use this symmetry to impose constraints on the estimator function, and we can use these constraints to reduce the number of regressor variables in the problem. This paper investigates the utility of regression generally, and a variety of different symmetric regression schemes particularly, for hyperspectral background estimation in the context of generic target detection.
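The reduction in regressors that the eightfold symmetry allows can be sketched under some strong simplifying assumptions, none of which come from the paper: a single band, the smallest possible annulus (the 3×3 ring of eight neighbours), and a least-squares fit with no intercept. In that ring, reflections and right-angle rotations map the four edge neighbours onto each other and the four corner neighbours onto each other, so a symmetric linear estimator needs only two coefficients instead of eight. The names `orbit_means` and `fit_symmetric` and the synthetic test image are hypothetical.

```python
def orbit_means(img, r, c):
    """Mean of the 4 edge neighbours and mean of the 4 corner neighbours of (r, c)."""
    edges = (img[r - 1][c] + img[r + 1][c] + img[r][c - 1] + img[r][c + 1]) / 4.0
    corners = (img[r - 1][c - 1] + img[r - 1][c + 1]
               + img[r + 1][c - 1] + img[r + 1][c + 1]) / 4.0
    return edges, corners


def fit_symmetric(img):
    """Least-squares fit of centre ~ a*edges + b*corners over all interior pixels."""
    see = sek = skk = sey = sky = 0.0
    for r in range(1, len(img) - 1):
        for c in range(1, len(img[0]) - 1):
            e, k = orbit_means(img, r, c)
            y = img[r][c]
            see += e * e
            sek += e * k
            skk += k * k
            sey += e * y
            sky += k * y
    # Solve the 2x2 normal equations with Cramer's rule.
    det = see * skk - sek * sek
    a = (sey * skk - sky * sek) / det
    b = (sky * see - sey * sek) / det
    return a, b


# Synthetic single-band image with a smooth quadratic background (illustrative only).
img = [[float(r * r + c * c) for c in range(6)] for r in range(6)]
a, b = fit_symmetric(img)

# Compare the worst-case error of the fitted estimator with the plain annulus mean.
fit_err = 0.0
mean_err = 0.0
for r in range(1, 5):
    for c in range(1, 5):
        e, k = orbit_means(img, r, c)
        fit_err = max(fit_err, abs(a * e + b * k - img[r][c]))
        mean_err = max(mean_err, abs((e + k) / 2.0 - img[r][c]))
```

On this synthetic quadratic surface the plain 8-neighbour mean is biased by the curvature of the background, while the two-coefficient symmetric fit reproduces the centre pixel. Real imagery would of course include noise and multiple bands.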