
## Sparse and shift-invariant representations of music (2006)


### Download Links

- [www.personal.soton.ac.uk]
- [www.see.ed.ac.uk]
- [users.fmrib.ox.ac.uk]
- [eprints.soton.ac.uk]
- [www.elec.qmul.ac.uk]
### Other Repositories/Bibliography

- DBLP

Venue: IEEE Transactions on Audio, Speech, and Language Processing

Citations: 46 (9 self)

### Citations

1285 | Emergence of simple-cell receptive field properties by learning a sparse code for natural images, Nature
- Olshausen, Field
- 1996

Citation Context: ...espect to cannot be evaluated analytically, different strategies have been proposed including Markov Chain Monte Carlo methods [12], [9] and first-order Laplace approximations [4]. Olshausen and Field [3] proposed a method that uses a delta approximation of the distribution at its maximum a posteriori (MAP) estimate. In this case, the gradient simplifies to where is the MAP estimate of and is the reco...
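The delta-approximation idea in this excerpt — evaluating the intractable expectation at the MAP estimate of the coefficients, so the dictionary gradient reduces to an outer product of the reconstruction residual with the MAP coefficients — can be sketched compactly. Everything below (the ISTA-style inference step, the Laplacian prior, the problem sizes and step sizes) is a hypothetical stand-in for illustration, not the cited paper's actual algorithm:

```python
import numpy as np

def map_estimate(x, A, lam, n_iter=100, step=0.01):
    """Crude MAP estimate of the coefficients under a Laplacian prior
    (an ISTA-style proximal-gradient stand-in for the inference step)."""
    s = np.zeros(A.shape[1])
    for _ in range(n_iter):
        g = A.T @ (A @ s - x)                 # gradient of 0.5 * ||x - A s||^2
        s = s - step * g
        s = np.sign(s) * np.maximum(np.abs(s) - step * lam, 0.0)  # soft threshold
    return s

def dictionary_gradient(x, A, lam):
    """Delta approximation: evaluate the expectation at the MAP estimate,
    so the dictionary gradient reduces to (x - A s_map) s_map^T."""
    s_map = map_estimate(x, A, lam)
    residual = x - A @ s_map
    return np.outer(residual, s_map)

# one stochastic gradient step on a toy problem
rng = np.random.default_rng(0)
A = rng.standard_normal((16, 32))
A /= np.linalg.norm(A, axis=0)                # unit-norm atoms
x = A[:, :3] @ np.array([1.0, -0.5, 0.8])     # signal built from 3 atoms
dA = dictionary_gradient(x, A, lam=0.1)
A_new = A + 0.1 * dA                          # learning rate 0.1 (arbitrary)
```

In a full learner this step would be repeated over many data observations, with the atoms renormalised to keep the dictionary bounded.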

703 | Spikes: Exploring the Neural Code
- Rieke, Warland, et al.
- 1997

Citation Context: ...of this uncertainty of features in time and have to be adapted to incorporate such constraints. Such an adaptation can be based on the neurophysiological principles recently suggested in Rieke et al. [6], where a generative model of perception was proposed in which the stimulus is reconstructed by a convolution of a neural impulse train with a function describing a certain feature coded by the neuron...

462 | Possible principles underlying the transformation of sensory messages. In W. A. Rosenblith (Ed.), Sensory Communication
- Barlow
- 1961

Citation Context: ...of the specific physiological and neurological systems. One possible fundamental principle underlying the neurological processes of interpreting and recognising sensory stimuli was proposed by Barlow [1], who suggested that the main aim of mammalian primary perceptual processing is redundancy reduction. This idea has led to the development of sparse coding techniques to discover structure in natural ...

352 | Learning overcomplete representations
- Lewicki, Sejnowski
- 2000

Citation Context: ...in aim of mammalian primary perceptual processing is redundancy reduction. This idea has led to the development of sparse coding techniques to discover structure in natural stimuli such as images [2]–[4] and sound [5]. In these methods, redundancy reduction is achieved by representing the signal by a combination of a small number of features taken from a set of elementary waveforms called a dictionar...
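The core idea in this excerpt — representing a signal as a combination of a small number of features drawn from a dictionary — can be illustrated with a minimal greedy matching-pursuit sketch. The dictionary sizes and atom counts below are arbitrary choices, and the cited works do not prescribe this particular algorithm:

```python
import numpy as np

def matching_pursuit(x, D, n_atoms=5):
    """Greedy sparse approximation: repeatedly pick the unit-norm
    dictionary atom most correlated with the current residual."""
    residual = x.astype(float).copy()
    coeffs = np.zeros(D.shape[1])
    for _ in range(n_atoms):
        corr = D.T @ residual
        k = int(np.argmax(np.abs(corr)))
        coeffs[k] += corr[k]
        residual -= corr[k] * D[:, k]
    return coeffs, residual

rng = np.random.default_rng(1)
D = rng.standard_normal((64, 128))
D /= np.linalg.norm(D, axis=0)           # normalise the dictionary atoms
x = 2.0 * D[:, 5] - 1.5 * D[:, 40]       # signal made from just two atoms
coeffs, residual = matching_pursuit(x, D)
```

The resulting coefficient vector has at most a handful of non-zero entries, which is what "redundancy reduction" amounts to operationally: most of the code is silent for any given signal.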

187 | Forming sparse representations by local anti-Hebbian learning
- Földiák
- 1990

Citation Context: ...e main aim of mammalian primary perceptual processing is redundancy reduction. This idea has led to the development of sparse coding techniques to discover structure in natural stimuli such as images [2]–[4] and sound [5]. In these methods, redundancy reduction is achieved by representing the signal by a combination of a small number of features taken from a set of elementary waveforms called a dicti...

185 | Dictionary learning algorithms for sparse representation
- Kreutz-Delgado, Murray, et al.
- 2003

Citation Context: ...tion maximization (EM) algorithm proposed by Figueiredo et al. in [17], which for certain prior formulations is equivalent to the FOCUSS algorithm proposed by Rao in [18] and Kreutz-Delgado et al. in [19]. III. APPROXIMATE INFERENCE USING A SUBSET SELECTION STEP Many engineering problems of interest suffer from high dimensionality. In the problems studied here, the length of the expected features can o...

139 | Probabilistic framework for the adaptation and comparison of image codes
- Lewicki, Olshausen
- 1999

Citation Context: ...inding the maximum likelihood estimate of the marginal likelihood , which can be done by stochastic gradient optimization. The gradient can be estimated using a single data observation and, following [11], can be written as where denotes expectation. Taking the derivative of with respect to the elements in and assuming a Gaussian error term , this can be written as [11] where the derivative is again w...

134 | Efficient coding of natural sounds
- Lewicki

Citation Context: ...ity can be measured based on a spectral feature calculated by smoothing the power-spectrum of the features . This is done here by calculating the energy in the spectrum in several frequency bands. In [5], the statistics of natural sounds have been shown to lead to efficient codes that have a wider frequency support at high frequencies. It was further argued in [5] that for speech, music, and some nat...
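The band-energy measure described in this excerpt — smoothing a feature's power spectrum by summing it over a few frequency bands — takes only a few lines. The sample rate, band count, and toy tone below are hypothetical values chosen purely to illustrate the idea:

```python
import numpy as np

def band_energies(feature, n_bands=4):
    """Smooth spectral summary of a feature waveform: power-spectrum
    energy collected into a few equal-width frequency bands."""
    spec = np.abs(np.fft.rfft(feature)) ** 2
    return np.array([b.sum() for b in np.array_split(spec, n_bands)])

# toy 'feature': a decaying 500 Hz tone at an assumed 8 kHz sample rate
t = np.arange(256) / 8000.0
feature = np.exp(-40.0 * t) * np.sin(2 * np.pi * 500.0 * t)
e = band_energies(feature)   # energy concentrates in the lowest band
```

Comparing such band-energy vectors across learned features gives a phase-insensitive way to group features by their frequency support.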

123 | An affine scaling methodology for best basis selection
- Rao, Kreutz-Delgado
- 1999

Citation Context: ...orted below, is to use the expectation maximization (EM) algorithm proposed by Figueiredo et al. in [17], which for certain prior formulations is equivalent to the FOCUSS algorithm proposed by Rao in [18] and Kreutz-Delgado et al. in [19]. III. APPROXIMATE INFERENCE USING A SUBSET SELECTION STEP Many engineering problems of interest suffer from high dimensionality. In the problems studied here, the len...

116 | Adaptive sparseness for supervised learning
- Figueiredo

Citation Context: ... descent procedure can be applied. Another approach, which is the method used in the experiments reported below, is to use the expectation maximization (EM) algorithm proposed by Figueiredo et al. in [17], which for certain prior formulations is equivalent to the FOCUSS algorithm proposed by Rao in [18] and Kreutz-Delgado et al. in [19]. III. APPROXIMATE INFERENCE USING A SUBSET SELECTION STEP Many eng...
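For readers unfamiliar with the FOCUSS algorithm referenced here: each iteration solves a reweighted minimum-norm problem whose weights are derived from the current coefficients, so that small entries receive small weights and are driven toward zero. The sketch below is illustrative only; the dimensions, the exponent `p`, and the regularisation constants are assumptions, not values from the cited works:

```python
import numpy as np

def focuss(A, x, p=0.5, n_iter=30, ridge=1e-8, floor=1e-8):
    """FOCUSS-style iteratively reweighted minimum-norm solver:
    each iteration re-solves x = (A W) y with W = diag(|s|^(1 - p/2)),
    which is sparsity-promoting for p < 2."""
    m, _ = A.shape
    s = np.linalg.pinv(A) @ x                     # minimum-norm start
    for _ in range(n_iter):
        w = np.abs(s) ** (1.0 - p / 2.0) + floor  # diagonal of W
        Aw = A * w                                # A @ diag(w)
        y = Aw.T @ np.linalg.solve(Aw @ Aw.T + ridge * np.eye(m), x)
        s = w * y                                 # map back: s = W y
    return s

rng = np.random.default_rng(2)
A = rng.standard_normal((20, 50))
s_true = np.zeros(50)
s_true[[3, 17, 30]] = [1.0, -2.0, 0.5]           # 3-sparse ground truth
x = A @ s_true
s_hat = focuss(A, x)
```

The EM algorithm mentioned in the excerpt yields updates of this same reweighted form for certain prior choices, which is the equivalence the text refers to.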

109 | Non-negative matrix factor deconvolution: Extraction of multiple sound sources from monophonic inputs
- Smaragdis

Citation Context: ...uld be to use a phase-blind spectral model. The power spectrum of a time series is less affected by the exact positions of the block locations and has, therefore, been proposed for feature extraction [14], [15]. A detailed comparison between phase-blind spectral methods and the shift-invariant time-domain approach for music analysis can be found in [16], where the differences and similarities in the r...

51 | Learning sparse codes with a mixture-of-Gaussians prior
- Olshausen, Millman

Citation Context: ...pect to the individual entries in the matrix . As this expectation with respect to cannot be evaluated analytically, different strategies have been proposed including Markov Chain Monte Carlo methods [12], [9] and first-order Laplace approximations [4]. Olshausen and Field [3] proposed a method that uses a delta approximation of the distribution at its maximum a posteriori (MAP) estimate. In this case,...

45 | Separation of sound sources by convolutive sparse coding
- Virtanen
- 2004

Citation Context: ... to use a phase-blind spectral model. The power spectrum of a time series is less affected by the exact positions of the block locations and has, therefore, been proposed for feature extraction [14], [15]. A detailed comparison between phase-blind spectral methods and the shift-invariant time-domain approach for music analysis can be found in [16], where the differences and similarities in the represe...

41 | Learning sparse multiscale image representations
- Sallee, Olshausen
- 2003

Citation Context: ...o the individual entries in the matrix . As this expectation with respect to cannot be evaluated analytically, different strategies have been proposed including Markov Chain Monte Carlo methods [12], [9] and first-order Laplace approximations [4]. Olshausen and Field [3] proposed a method that uses a delta approximation of the distribution at its maximum a posteriori (MAP) estimate. In this case, the ...

29 | A probabilistic approach to single channel blind signal separation
- Jang, Lee
- 2003

Citation Context: .... Previous approaches to single-channel blind source separation reported in the literature either rely on prior knowledge of a source model for each source to be recovered (see, for example, [20] and [21]) or treat the extracted features as individual sources (see, for example, [14] and [15]). The models in [14], [15], and [20] are further based on phase-blind spectral models that recover the sources ...

24 | Coding time-varying signals using sparse, shift-invariant representations. Advances in Neural Information Processing Systems
- Lewicki, Sejnowski
- 1999

Citation Context: ...ations. It is, however, of advantage to keep the number of free parameters low, which can be done by explicitly enforcing the shift-invariant structure in the dictionary as suggested in [6]–[10], and [13]. To state the model used in these references, we introduce the following notation. From now on, we differentiate between the structured matrix in which all features occur at all possible shifted posi...
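The structured dictionary described in this excerpt — every feature occurring at every possible shift — is mathematically a convolution, and the sketch below builds both forms to make the equivalence concrete. The feature lengths and activation positions are arbitrary toy values:

```python
import numpy as np

def shift_invariant_synthesis(activations, features):
    """Convolutional form: signal = sum over features of
    (sparse activation train) convolved with (short feature waveform)."""
    T = activations.shape[1]
    L = features.shape[1]
    x = np.zeros(T + L - 1)
    for a, f in zip(activations, features):
        x += np.convolve(a, f)
    return x

def shifted_dictionary(features, T):
    """Equivalent explicit structured matrix: one column per
    (feature, shift) pair, each a shifted copy of that feature."""
    K, L = features.shape
    A = np.zeros((T + L - 1, K * T))
    for k in range(K):
        for t in range(T):
            A[t:t + L, k * T + t] = features[k]
    return A

features = np.array([[1.0, -1.0, 0.5, 0.0],
                     [0.2,  0.4, 0.8, 1.0]])   # two short toy features
acts = np.zeros((2, 100))
acts[0, 10] = 1.0                              # feature 0 at sample 10
acts[1, [30, 60]] = [0.5, -1.0]                # feature 1 at samples 30, 60
x = shift_invariant_synthesis(acts, features)
```

Both forms produce the same signal, but the convolutional form never materialises the matrix with one column per shift, which is why enforcing the shift-invariant structure keeps the number of free parameters low.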

20 | Sparse representations of polyphonic music
- Plumbley, Abdallah, et al.
- 2006

Citation Context: ... therefore, been proposed for feature extraction [14], [15]. A detailed comparison between phase-blind spectral methods and the shift-invariant time-domain approach for music analysis can be found in [16], where the differences and similarities in the representations found with these two approaches are studied...

18 | Sparse coding of time-varying natural images
- Olshausen
- 2000

Citation Context: .... The shift-invariant sparse coding formulation does, however, not scale with the problem size, making the computational requirements for many real world problems prohibitively large. For example, in [8], the optimization problem to be solved had 6528 dimensions, while for the experiment reported in Section IV-B, the dimension of the optimization problems is 383 500, which is two orders of magnitude...

15 | Sparse coding with invariance constraints
- Wersing, Eggert, et al.
- 2003

Citation Context: ...onvolution of a neural impulse train with a function describing a certain feature coded by the neuron under study. Such methods led to the development of shift-invariant sparse coding proposed in [7]–[10]. Section II reviews the concept of sparse coding and its extension to shift-invariant sparse coding. The shift-invariant sparse coding formulation does, however, not scale with the problem size, maki...

15 | Underdetermined source separation with structured source priors
- Vincent, Rodet
- 2004

Citation Context: ...y support. Previous approaches to single-channel blind source separation reported in the literature either rely on prior knowledge of a source model for each source to be recovered (see, for example, [20] and [21]) or treat the extracted features as individual sources (see, for example, [14] and [15]). The models in [14], [15], and [20] are further based on phase-blind spectral models that recover the...

5 | Emergence of movement-sensitive neurons' properties by learning a sparse code of natural moving images
- Bogacz, Brown, et al.
- 2001

Citation Context: ... a convolution of a neural impulse train with a function describing a certain feature coded by the neuron under study. Such methods led to the development of shift-invariant sparse coding proposed in [7]–[10]. Section II reviews the concept of sparse coding and its extension to shift-invariant sparse coding. The shift-invariant sparse coding formulation does, however, not scale with the problem size,...

2 | Proposals for Performance Measurement in Source Separation. Institut de Recherche et Coordination Acoustique/Musique
- Gribonval, Benaroya, et al.
- 2003

Citation Context: ...decomposition of the mixture to reconstruct the sources using only those features assigned to each individual source. The performance of this separation was then measured using the method proposed in [22]. This gives us a measure of the signal to interference ratio (SIR), i.e., the ratio of the true source to the interference of the other sources in the estimated source as well as the signal to artefa...
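A simplified decomposition in the spirit of the cited performance measure: project the estimated source onto the true target and onto the span of all true sources, then report SIR (target vs. interference) and SAR (signal vs. artefacts) in dB. This is a sketch of the general idea under the stated simplifications, not the exact procedure of [22]:

```python
import numpy as np

def sir_sar(s_est, s_true, others):
    """Decompose an estimated source into target, interference, and
    artefact parts, and report SIR and SAR in dB (simplified BSS-style
    measure)."""
    # part of the estimate aligned with the true target source
    s_target = (s_est @ s_true) / (s_true @ s_true) * s_true
    # projection of the estimate onto the span of all true sources
    B = np.stack([s_true] + list(others), axis=1)
    proj = B @ np.linalg.lstsq(B, s_est, rcond=None)[0]
    e_interf = proj - s_target          # leakage from the other sources
    e_artif = s_est - proj              # everything outside the source span
    sir = 10 * np.log10((s_target @ s_target) / (e_interf @ e_interf))
    sar = 10 * np.log10((proj @ proj) / (e_artif @ e_artif))
    return sir, sar

# toy check: estimate = target + 10% interference + 1% artefact
s1 = np.array([1.0, 0.0, 0.0, 0.0])
s2 = np.array([0.0, 1.0, 0.0, 0.0])
est = s1 + 0.1 * s2 + 0.01 * np.array([0.0, 0.0, 1.0, 0.0])
sir, sar = sir_sar(est, s1, [s2])
```

With orthogonal toy sources as above, the 10% interference term yields an SIR of 20 dB, matching the intuition that SIR measures how much of the other sources leaks into the estimate.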