Monaural sound source separation by nonnegative matrix factorization with temporal continuity and sparseness criteria
IEEE Trans. on Audio, Speech and Lang. Processing, 2007
Cited by 189 (30 self)
Abstract: An unsupervised learning algorithm for the separation of sound sources in one-channel music signals is presented. The algorithm is based on factorizing the magnitude spectrogram of an input signal into a sum of components, each of which has a fixed magnitude spectrum and a time-varying gain. Each sound source, in turn, is modeled as a sum of one or more components. The parameters of the components are estimated by minimizing the reconstruction error between the input spectrogram and the model, while restricting the component spectrograms to be nonnegative and favoring components whose gains are slowly varying and sparse. Temporal continuity is favored by using a cost term which is the sum of squared differences between the gains in adjacent frames, and sparseness is favored by penalizing nonzero gains. The proposed iterative estimation algorithm is initialized with random values, and the gains and the spectra are then alternately updated using multiplicative update rules until the values converge. Simulation experiments were carried out using generated mixtures of pitched musical instrument samples and drum sounds. The performance of the proposed method was compared with independent subspace analysis and basic nonnegative matrix factorization, which are based on the same linear model. According to these simulations, the proposed method enables a better separation quality than the previous algorithms. In particular, the temporal continuity criterion improved the detection of pitched musical sounds. The sparseness criterion did not produce significant improvements. Index Terms: Acoustic signal analysis, audio source separation, blind source separation, music, nonnegative matrix factorization, sparse coding, unsupervised learning.
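The alternating multiplicative updates described in this abstract can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not the paper's exact algorithm: it assumes a squared-error reconstruction term, a plain sum-of-gains sparseness penalty, and unit-norm spectra to fix the scale. The function name nmf_smooth_sparse and the weights alpha and beta are hypothetical.

```python
import numpy as np

def nmf_smooth_sparse(X, n_components, alpha=1.0, beta=0.1, n_iter=200, seed=0):
    """Factor a nonnegative spectrogram X (freq x frames, frames >= 2) as B @ G.

    Assumed cost (illustrative only):
      0.5 * ||X - B G||^2 + alpha * sum_t (g_t - g_{t-1})^2 + beta * sum(G)
    """
    rng = np.random.default_rng(seed)
    eps = 1e-12
    n_freq, n_frames = X.shape
    B = rng.random((n_freq, n_components)) + eps   # fixed spectra
    G = rng.random((n_components, n_frames)) + eps  # time-varying gains
    # Each interior frame has two neighbours, the first and last have one.
    n_neighbors = np.full(n_frames, 2.0)
    n_neighbors[[0, -1]] = 1.0
    for _ in range(n_iter):
        prev = np.zeros_like(G)
        prev[:, 1:] = G[:, :-1]
        nxt = np.zeros_like(G)
        nxt[:, :-1] = G[:, 1:]
        # Gradient split into negative part (numerator) and positive part
        # (denominator); the ratio keeps the gains nonnegative.
        num = B.T @ X + 2.0 * alpha * (prev + nxt)
        den = B.T @ B @ G + 2.0 * alpha * n_neighbors * G + beta + eps
        G *= num / den
        B *= (X @ G.T) / (B @ G @ G.T + eps)
        # Fix the scale ambiguity so the gain penalties stay meaningful.
        norms = np.linalg.norm(B, axis=0) + eps
        B /= norms
        G *= norms[:, None]
    return B, G
```

Setting alpha and beta to zero reduces the sketch to basic squared-error NMF, which is a convenient baseline when judging what the two penalties change.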
Generalized nonnegative matrix approximations with Bregman divergences
In: Neural Information Proc. Systems, 2005
Cited by 99 (5 self)
Nonnegative matrix approximation (NNMA) is a recent technique for dimensionality reduction and data analysis that yields a parts-based, sparse nonnegative representation for nonnegative input data. NNMA has found a wide variety of applications, including text analysis, document clustering, face/image recognition, language modeling, speech processing and many others. Despite these numerous applications, the algorithmic development for computing the NNMA factors has been relatively deficient. This paper makes algorithmic progress by modeling and solving (using multiplicative updates) new generalized NNMA problems that minimize Bregman divergences between the input matrix and its low-rank approximation. The multiplicative update formulae in the pioneering work by Lee and Seung [11] arise as a special case of our algorithms. In addition, the paper shows how to use penalty functions for incorporating constraints other than nonnegativity into the problem. Further, some interesting extensions to the use of “link” functions for modeling nonlinear relationships are also discussed.
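As a point of reference for the special case mentioned above, the following sketch shows the two well-known Lee and Seung multiplicative updates, one for the squared Euclidean distance and one for the generalized Kullback-Leibler divergence (both are Bregman divergences). It does not implement the paper's generalized algorithms; the function names and defaults are illustrative assumptions.

```python
import numpy as np

def frobenius(X, Y):
    """Squared Euclidean cost 0.5 * ||X - Y||^2."""
    return 0.5 * np.sum((X - Y) ** 2)

def generalized_kl(X, Y):
    """Generalized KL divergence D(X || Y) for strictly positive X and Y."""
    return np.sum(X * np.log(X / Y) - X + Y)

def nmf_multiplicative(X, rank, divergence="kl", n_iter=100, seed=0):
    """Lee-Seung style multiplicative updates for X ~ W @ H."""
    rng = np.random.default_rng(seed)
    eps = 1e-12
    W = rng.random((X.shape[0], rank)) + 0.1
    H = rng.random((rank, X.shape[1])) + 0.1
    for _ in range(n_iter):
        if divergence == "frobenius":
            H *= (W.T @ X) / (W.T @ W @ H + eps)
            W *= (X @ H.T) / (W @ H @ H.T + eps)
        elif divergence == "kl":
            H *= (W.T @ (X / (W @ H + eps))) / (W.sum(axis=0)[:, None] + eps)
            W *= ((X / (W @ H + eps)) @ H.T) / (H.sum(axis=1)[None, :] + eps)
        else:
            raise ValueError("divergence must be 'frobenius' or 'kl'")
    return W, H
```

Both branches share the same shape: each factor is multiplied elementwise by a ratio of nonnegative terms, so nonnegativity is preserved without any projection step.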
Convolutive speech bases and their application to supervised speech separation
IEEE Transactions on Audio, Speech and Language Processing, 2007
Cited by 94 (7 self)
In this paper we present a convolutive basis decomposition method and its application to separating simultaneous speakers in monophonic recordings. The model we propose is a convolutive version of the nonnegative matrix factorization algorithm. Due to the nonnegativity constraint, this type of coding is very well suited for intuitively and efficiently representing magnitude spectra. We present results that reveal the nature of these basis functions and we introduce their utility in separating monophonic mixtures of known speakers.
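To make the idea of a convolutive factorization concrete, here is a minimal sketch of the forward model only, in which each basis is a short spectro-temporal patch rather than a single spectrum. It assumes the common formulation X ~ sum over tau of W[tau] @ shift(H, tau); the names shift_right and conv_reconstruct are hypothetical, and the estimation procedure from the paper is not reproduced.

```python
import numpy as np

def shift_right(H, tau):
    """Shift the columns of H right by tau frames, filling with zeros."""
    out = np.zeros_like(H)
    if tau == 0:
        return H.copy()
    out[:, tau:] = H[:, :-tau]
    return out

def conv_reconstruct(W, H):
    """Convolutive model: W has shape (T, n_freq, n_bases), H (n_bases, n_frames).

    W[tau][:, k] is the spectrum of basis k at lag tau, so every activation
    in H places a T-frame patch into the output spectrogram.
    """
    return sum(W[tau] @ shift_right(H, tau) for tau in range(W.shape[0]))

# One basis spanning two frames: low bin first, then high bin.
W = np.zeros((2, 2, 1))
W[0, :, 0] = [1.0, 0.0]
W[1, :, 0] = [0.0, 1.0]
H = np.array([[0.0, 1.0, 0.0]])  # the patch is triggered at frame 1
X_hat = conv_reconstruct(W, H)
```

With a single lag (T = 1) the model collapses to ordinary NMF, which is why the convolutive form is usually described as a generalization of it.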
Multichannel nonnegative matrix factorization in convolutive mixtures for audio source separation
IEEE Trans. Audio, Speech, Language Process., 2010
Cited by 79 (17 self)
We consider inference in a general data-driven object-based model of multichannel audio data, assumed to be generated as a possibly underdetermined convolutive mixture of source signals. Each source is given a model inspired by nonnegative matrix factorization (NMF) with the Itakura-Saito divergence, which underlies a statistical model of superimposed Gaussian components. We address estimation of the mixing and source parameters using two methods. The first one consists of maximizing the exact joint likelihood of the multichannel data using an expectation-maximization algorithm. The second method consists of maximizing the sum of individual likelihoods of all channels using a multiplicative update algorithm inspired by NMF methodology. Our decomposition algorithms were applied to stereo music and assessed in terms of blind source separation performance. Index Terms: Multichannel audio, nonnegative matrix factorization, nonnegative tensor factorization, underdetermined convolutive blind source separation.
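For orientation, the Itakura-Saito divergence and a single-channel NMF built on it can be sketched as below. This deliberately leaves out the multichannel mixing model and the EM algorithm of the paper; the update shown is the commonly used heuristic multiplicative rule for this divergence, and the function names are assumptions made for illustration.

```python
import numpy as np

def itakura_saito(X, Y):
    """Itakura-Saito divergence sum(x/y - log(x/y) - 1) for positive X, Y."""
    R = X / Y
    return np.sum(R - np.log(R) - 1.0)

def is_nmf(V, rank, n_iter=100, seed=0):
    """Single-channel NMF of a strictly positive power spectrogram V."""
    rng = np.random.default_rng(seed)
    W = rng.random((V.shape[0], rank)) + 0.1
    H = rng.random((rank, V.shape[1])) + 0.1
    for _ in range(n_iter):
        V_hat = W @ H
        H *= (W.T @ (V / V_hat**2)) / (W.T @ (1.0 / V_hat))
        V_hat = W @ H
        W *= ((V / V_hat**2) @ H.T) / ((1.0 / V_hat) @ H.T)
    return W, H
```

A property worth noting is scale invariance: multiplying both arguments by the same constant leaves the divergence unchanged, so quiet and loud time-frequency bins are weighted alike.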
Multiple fundamental frequency estimation by summing harmonic amplitudes
in ISMIR, 2006
Cited by 76 (10 self)
This paper proposes a conceptually simple and computationally efficient fundamental frequency (F0) estimator for polyphonic music signals. The studied class of estimators calculates the salience, or strength, of an F0 candidate as a weighted sum of the amplitudes of its harmonic partials. A mapping from the Fourier spectrum to an “F0 salience spectrum” is found by optimization using generated training material. Based on the resulting function, three different estimators are proposed: a “direct” method, an iterative estimation and cancellation method, and a method that estimates multiple F0s jointly. The latter two performed as well as a considerably more complex reference method. The number ...
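The salience computation described above can be illustrated with a small sketch. The paper learns its harmonic weights by optimization; the 1/m weight used here is only a placeholder assumption, and the function names and the toy spectrum are hypothetical.

```python
def f0_salience(spectrum, bin_hz, f0_hz, n_harmonics=10):
    """Weighted sum of spectral amplitudes at the harmonics of f0_hz.

    spectrum is a sequence of magnitudes, bin k covering frequency k * bin_hz.
    """
    total = 0.0
    for m in range(1, n_harmonics + 1):
        k = round(m * f0_hz / bin_hz)
        if k >= len(spectrum):
            break  # harmonic lies above the analysed band
        total += spectrum[k] / m  # placeholder weight 1/m, not the learned one
    return total

def estimate_f0(spectrum, bin_hz, candidates_hz):
    """Pick the single candidate with the highest salience."""
    return max(candidates_hz, key=lambda f0: f0_salience(spectrum, bin_hz, f0))

# Toy spectrum with 10 Hz bins and partials at 200, 400 and 600 Hz.
spectrum = [0.0] * 101
for k in (20, 40, 60):
    spectrum[k] = 1.0
best = estimate_f0(spectrum, 10.0, [100, 150, 200, 300, 400])
```

The decaying weight is what keeps the subharmonic candidate at 100 Hz from winning: it also hits all three partials, but only through its even harmonics, which carry less weight.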
Nonsmooth nonnegative matrix factorization (nsNMF)
IEEE Transactions on ..., 2006
Cited by 66 (4 self)
Abstract: We propose a novel nonnegative matrix factorization model that aims at finding localized, part-based representations of nonnegative multivariate data items. Unlike the classical nonnegative matrix factorization (NMF) technique, this new model, denoted “nonsmooth nonnegative matrix factorization” (nsNMF), corresponds to the optimization of an unambiguous cost function designed to explicitly represent sparseness, in the form of nonsmoothness, which is controlled by a single parameter. In general, this method produces a set of basis and encoding vectors that are not only capable of representing the original data, but also extract highly localized patterns, which generally lend themselves to improved interpretability. The properties of this new method are illustrated with several data sets. Comparisons to previously published methods show that the new nsNMF method has some advantages in keeping faithfulness to the data while achieving a high degree of sparseness for both the estimated basis and the encoding vectors, and in better interpretability of the factors. Index Terms: nonnegative matrix factorization, constrained optimization, data mining, mining methods and algorithms, pattern analysis, feature extraction or construction, sparse, structured, and very large systems.
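The single controlling parameter can be illustrated with the smoothing matrix that, to the best of my recollection of the nsNMF formulation, is placed between the two factors (V ~ W S H, with S = (1 - theta) I + (theta / q) 1 1^T). The formula is not stated in the abstract itself, so treat it, along with the function name, as an assumption of this sketch.

```python
import numpy as np

def smoothing_matrix(q, theta):
    """S = (1 - theta) * I + (theta / q) * ones, for 0 <= theta <= 1.

    theta = 0 gives the identity (plain NMF); theta = 1 replaces every
    vector by its mean, i.e. maximal smoothing.
    """
    return (1.0 - theta) * np.eye(q) + (theta / q) * np.ones((q, q))

h = np.array([3.0, 0.0, 0.0])          # a very sparse encoding vector
smoothed = smoothing_matrix(3, 0.5) @ h  # energy is spread over all entries
```

Because S blurs whatever it multiplies, the factors on either side of it have to become sparser to reproduce the data equally well, which is the intuition behind expressing sparseness as nonsmoothness.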
Separation of drums from polyphonic music using nonnegative matrix factorization and support vector machine
In: Proc. EUSIPCO’2005, 2005
Cited by 56 (4 self)
This paper presents a procedure for the separation of pitched musical instruments and drums from polyphonic music. The method is based on two-stage processing in which the input signal is first separated into elementary time-frequency components which are then organized into sound sources. Nonnegative matrix factorization (NMF) is used to separate the input spectrogram into components having a fixed spectrum with time-varying gain. Each component is classified as either a pitched instrument or drums using a support vector machine (SVM). The classifier is trained using example signals from both classes. Simulation experiments were carried out using mixtures generated from real-world polyphonic music signals. The results indicate that the proposed method enables better separation quality than existing methods based on sinusoidal modeling and onset detection. Demonstration signals are available at
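The second stage, organizing classified components into sources, might look like the sketch below. The classifier is outside its scope: the per-component labels are assumed to be given already (for instance by an SVM), and the function name and label strings are hypothetical.

```python
import numpy as np

def group_components(B, G, labels):
    """Sum NMF components into one magnitude spectrogram per class label.

    B: (n_freq, n_components) spectra, G: (n_components, n_frames) gains,
    labels: one label per component, e.g. "drums" or "pitched".
    """
    sources = {}
    for j, label in enumerate(labels):
        component = np.outer(B[:, j], G[j, :])  # rank-one spectrogram
        sources[label] = sources.get(label, 0) + component
    return sources

rng = np.random.default_rng(0)
B = rng.random((5, 3))
G = rng.random((3, 7))
sources = group_components(B, G, ["drums", "pitched", "drums"])
```

Since every component lands in exactly one group, the grouped spectrograms add up to the full model B @ G, so nothing is lost or counted twice in this step.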
Exemplar-based sparse representations for noise robust automatic speech recognition
2010
Nonnegative matrix factor 2D deconvolution for blind single channel source separation
in 6th International Conference on Independent Component Analysis and Blind Source Separation, Charleston, SC, 2006
Cited by 54 (3 self)
Abstract. We present a novel method for blind separation of instruments in polyphonic music based on a nonnegative matrix factor 2D deconvolution algorithm. Using a model which is convolutive in both time and frequency we factorize a spectrogram representation of music into components corresponding to individual instruments. Based on this factorization we separate the instruments using spectrogram masking. The proposed algorithm has applications in computational auditory scene analysis, music information retrieval, and automatic music transcription.
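The masking step mentioned above can be sketched independently of the factorization. The example assumes that a nonnegative model spectrogram is already available for each instrument and uses soft ratio masks, which is one common choice and not necessarily the one used in the paper; the 2D deconvolution itself is not implemented, and the function name is hypothetical.

```python
import numpy as np

def separate_by_masking(X, source_models, eps=1e-12):
    """Split a mixture spectrogram X using soft ratio masks.

    source_models: list of nonnegative arrays with the same shape as X,
    one per instrument, e.g. produced by a factorization of X.
    """
    total = sum(source_models) + eps
    return [X * (model / total) for model in source_models]

rng = np.random.default_rng(0)
X = rng.random((4, 6))
models = [rng.random((4, 6)) + 0.1, rng.random((4, 6)) + 0.1]
estimates = separate_by_masking(X, models)
```

Ratio masks sum to one in every time-frequency bin, so the separated spectrograms add back up to the mixture even when the models fit it imperfectly.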