
## DEEP NMF FOR SPEECH SEPARATION (2015)

Citations: 1 (0 self)

### Citations

1218 | Algorithms for Non-negative Matrix Factorization
- Lee, Seung
- 2001
Citation Context: ...of the number of parameters. Index Terms— Deep unfolding, Non-negative Matrix Factorization, Deep Neural Network, Non-negative Back-propagation 1. INTRODUCTION Non-negative matrix factorization (NMF) [1] is a popular algorithm commonly used for challenging single-channel audio source separation tasks, such as speech separation (i.e., speech enhancement in the presence of difficult non-stationary nois...
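The Lee–Seung multiplicative updates cited here can be sketched in NumPy. This is a hypothetical illustration of generalized-KL NMF, not code from either paper; the function name, initialization, and defaults are assumptions:

```python
import numpy as np

def nmf_kl(M, rank, n_iter=200, eps=1e-12, seed=0):
    """Multiplicative-update NMF minimizing the generalized KL divergence
    D_KL(M | WH), in the spirit of Lee & Seung (a minimal sketch)."""
    rng = np.random.default_rng(seed)
    F, T = M.shape
    W = rng.random((F, rank)) + eps   # non-negative bases
    H = rng.random((rank, T)) + eps   # non-negative activations
    for _ in range(n_iter):
        WH = W @ H + eps
        # H <- H * (W^T (M/WH)) / (W^T 1)
        H *= (W.T @ (M / WH)) / (W.sum(axis=0, keepdims=True).T + eps)
        WH = W @ H + eps
        # W <- W * ((M/WH) H^T) / (1 H^T)
        W *= ((M / WH) @ H.T) / (H.sum(axis=1, keepdims=True).T + eps)
    return W, H
```

Each update is monotone non-increasing in the KL objective, which is why these rules are a common starting point for the separation pipelines discussed below.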

267 | Performance measurement in blind audio source separation
- Vincent, Gribonval, et al.
- 2006
Citation Context: ...sent results on the development set. To reduce complexity we use only 10% of the training utterances for all methods. Our evaluation measure for speech separation is source-to-distortion ratio (SDR) [12]. 3.1. Feature extraction Each feature vector concatenates T = 9 consecutive frames of left context, ending with the target frame, obtained as short-time Fourier spectral magnitudes, using 25 ms windo...
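The feature construction described in this context — concatenating T = 9 consecutive frames of left context, ending with the target frame — could look like the following sketch. The edge-padding choice (repeating the first frame) is an assumption; the excerpt does not specify it:

```python
import numpy as np

def stack_context(spec, T=9):
    """Stack T consecutive frames of left context (ending with the target
    frame) into one feature vector per frame.
    spec: (F, N) magnitude spectrogram -> returns (F*T, N) features."""
    F, N = spec.shape
    # Pad the left edge by repeating the first frame T-1 times (assumption).
    padded = np.concatenate([np.repeat(spec[:, :1], T - 1, axis=1), spec], axis=1)
    # Frame n sees padded columns n .. n+T-1; column-major flatten puts the
    # oldest context first and the target frame last.
    return np.stack(
        [padded[:, n:n + T].reshape(-1, order="F") for n in range(N)], axis=1)
```

With this layout, the last F entries of each feature vector are the target frame itself.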

152 | Nonnegative matrix factorization with the Itakura-Saito divergence. With application to music analysis
- Févotte, Bertin, et al.
- 2009
Citation Context: ...an be used to avoid scaling indeterminacy. The basic assumptions can then be written as M ≈ ∑_l S^l ≈ ∑_l W̃^l H^l = W̃H (1). The β-divergence, D_β, is an appropriate cost function for this approximation [9], which casts inference as an optimization of Ĥ: Ĥ = argmin_H D_β(M | W̃H) + µ|H|_1 (2). For β = 1, D_β is the generalized KL divergence, and β = 2 yields the squared error. An L1 sparsity constraint w...
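The β-divergence family invoked in this context can be written out directly. The sketch below uses the common convention in which β = 2 gives half the squared Frobenius error and β = 0 the Itakura-Saito divergence (the excerpt's "squared error" for β = 2 differs only by a constant factor):

```python
import numpy as np

def beta_divergence(M, V, beta, eps=1e-12):
    """Elementwise-summed beta-divergence D_beta(M | V), a generic sketch.
    beta=1: generalized KL; beta=0: Itakura-Saito; beta=2: 0.5*||M-V||_F^2."""
    M = np.asarray(M, dtype=float) + eps
    V = np.asarray(V, dtype=float) + eps
    if beta == 1:   # generalized KL divergence
        return float(np.sum(M * np.log(M / V) - M + V))
    if beta == 0:   # Itakura-Saito divergence
        return float(np.sum(M / V - np.log(M / V) - 1))
    # General case for beta not in {0, 1}
    return float(np.sum((M**beta + (beta - 1) * V**beta
                         - beta * M * V**(beta - 1)) / (beta * (beta - 1))))
```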

85 | Task-driven dictionary learning
- Mairal, Bach, et al.
- 2012
Citation Context: ...lly not trained for good separation performance from a mixture. Recently, discriminative methods have been applied to sparse dictionary based methods to achieve better performance in particular tasks [10]. In a similar way, we can discriminatively train NMF bases for source separation. The following optimization problem for training bases, termed discriminative NMF (DNMF), was proposed in [4, 5]: Ŵ = ...

70 | The second CHiME speech separation and recognition challenge: Datasets, tasks and baselines
- Vincent, Barker, et al.
- 2013
Citation Context: ...tor in the f th feature dimension in the kth layer. 3. EXPERIMENTS The deep NMF method was evaluated along with competitive models on the 2nd CHiME Speech Separation and Recognition Challenge corpus [11]. The task is speech separation in reverberated noisy mixtures (S = 2; l = 1: speech, l = 2: noise). The background noise, recorded in a home environment, consists of naturally occurring interference ...

51 | Learning fast approximations of sparse coding
- Gregor, LeCun
Citation Context: ...nal reconstruction layer. Various authors in the machine learning literature have considered unfolding iterative inference procedures into deep networks and discriminatively training their parameters [7], including some with applications to NMF [8, 5], but without untying the parameters, so they were in essence still within the realm of the original model. 2. DEEP NON-NEGATIVE MATRIX FACTORIZATION NM...

49 | Sparse coding and NMF
- Eggert, Korner
Citation Context: ...opologies (number of layers and number of hidden units per layer) in terms of SDR performance on the CHiME development set. Results are shown in Table 1. 3.3. Baseline 2: sparse NMF Sparse NMF (SNMF) [14] is used as a baseline, by optimizing the training objective W^l, H^l = argmin_{W^l, H^l} D_β(S^l | W̃^l H^l) + µ|H^l|_1 (12) for each source l. A multiplicative update algorithm to optimize (12) for arbitrar...
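The activation update behind the sparse-NMF objective (12) can be sketched for the β = 1 (generalized KL) case. This is a hypothetical NumPy illustration with the bases held fixed and L2-normalized, not the authors' implementation; names and defaults are assumptions:

```python
import numpy as np

def snmf_activations(S, W, mu=0.1, n_iter=100, eps=1e-12, seed=0):
    """Multiplicative updates for H in D_KL(S | W~ H) + mu*|H|_1,
    with W~ the column-L2-normalized bases (scaling indeterminacy removed).
    A minimal sketch; the bases W are held fixed here."""
    Wn = W / (np.linalg.norm(W, axis=0, keepdims=True) + eps)
    rng = np.random.default_rng(seed)
    H = rng.random((Wn.shape[1], S.shape[1])) + eps
    for _ in range(n_iter):
        WH = Wn @ H + eps
        # Sparsity enters as an additive mu in the denominator.
        H *= (Wn.T @ (S / WH)) / (Wn.sum(axis=0, keepdims=True).T + mu + eps)
    return H
```

Larger µ shrinks the activations toward zero, trading reconstruction accuracy for sparsity.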

33 | Non-negative matrix factorization based compensation of music for automatic speech recognition
- Raj, Virtanen, et al.
- 2010
Citation Context: ...ed for challenging single-channel audio source separation tasks, such as speech separation (i.e., speech enhancement in the presence of difficult non-stationary noises such as music and other speech) [2, 3]. In this context, the basic idea is to represent the features of the sources via sets of basis functions and their activation coefficients, one set per source. Mixtures of signals are then analyzed u...

17 | Discovering convolutive speech phones using sparseness and non-negativity
- O’Grady, Pearlmutter
- 2007
Citation Context: ...ne, by optimizing the training objective W^l, H^l = argmin_{W^l, H^l} D_β(S^l | W̃^l H^l) + µ|H^l|_1 (12) for each source l. A multiplicative update algorithm to optimize (12) for arbitrary β ≥ 0 is given by [15]. During training, we set S^1 and S^2 in (12) to the spectrograms of the concatenated noise-free CHiME training set and the corresponding background noise in the multi-condition training set. This yield...

15 | Real-time speech separation by semi-supervised nonnegative matrix factorization
- Joder, Weninger, et al.
- 2012
Citation Context: ...ed for challenging single-channel audio source separation tasks, such as speech separation (i.e., speech enhancement in the presence of difficult non-stationary noises such as music and other speech) [2, 3]. In this context, the basic idea is to represent the features of the sources via sets of basis functions and their activation coefficients, one set per source. Mixtures of signals are then analyzed u...

12 | Deep learning for monaural speech separation
- Huang, Kim, et al.
- 2014
Citation Context: ...ate. Here, based on our experience with model-based approaches, we train the masking function such that, when applied to the mixture, it best reconstructs the clean speech, which was also proposed in [13]. This amounts to optimizing the following objective function for the DNN training: E = ∑_{f,t} (y_{f,t} m_{f,t} − s^l_{f,t})² = ∑_{f,t} (s̃^l_{f,t} − s^l_{f,t})² (11), where m are the mixture magnitudes and s^l are the...
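Objective (11) — the squared error between the masked mixture and the clean target magnitudes — is straightforward to state in code; a minimal sketch (names are assumptions):

```python
import numpy as np

def signal_approximation_loss(mask, mixture_mag, target_mag):
    """Objective (11): E = sum_{f,t} (y_{f,t} * m_{f,t} - s_{f,t})^2,
    where mask y multiplies the mixture magnitudes m to estimate the
    clean source s~ = y * m (a minimal sketch)."""
    estimate = mask * mixture_mag          # s~ = y ⊙ m
    return float(np.sum((estimate - target_mag) ** 2))
```

The point of the objective is that the network is scored on the reconstructed source, not on the mask itself, so mask errors in low-energy bins cost little.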

8 | Deep unfolding: Model-based inspiration of novel deep architectures. arXiv preprint arXiv:1409.2574v4
- Hershey, Roux, et al.
- 2014
Citation Context: ...folding an iterative inference algorithm from a model-based method and untying its parameters into a deep network architecture is a very general one, called deep unfolding, which we recently proposed [6]. Whereas, for example, conventional sigmoid neural networks can be obtained by unfolding mean-field inference in Markov random fields, deep NMF is not only novel within the NMF literature, but it is ...

8 | Discriminatively trained recurrent neural networks for single-channel speech separation
- Weninger, Hershey, et al.
- 2014
Citation Context: ...lored to better understand the speed/accuracy trade-off. In subsequent experiments on DNNs, improved features and training procedures brought the best DNN performance to 10.46 dB with 4.1M parameters [16]. Application of these improvements to deep NMF is indicated so that the two methods can be compared on an equal footing. In [16], recurrent networks further improved performance on the same task to 12...

7 | Discriminative NMF and its application to single-channel source separation
- Weninger, Roux, et al.
- 2014
Citation Context: ...not consider separation performance in the context of a mixture signal. Such optimization, termed discriminative NMF, is generally difficult, but recently two different approaches have been proposed [4, 5]. While [5] optimizes the original NMF bases by cleverly solving for some derivatives of the objective function, [4] proposes to circumvent the difficulty of the optimization by generalizing the origi...

3 | Supervised non-Euclidean sparse NMF via bilevel optimization with applications to speech enhancement
- Sprechmann, Bronstein, et al.
- 2014

2 | Bilevel sparse models for polyphonic music transcription
- Yakar, Litman, et al.
- 2013
Citation Context: ...the machine learning literature have considered unfolding iterative inference procedures into deep networks and discriminatively training their parameters [7], including some with applications to NMF [8, 5], but without untying the parameters, so they were in essence still within the realm of the original model. 2. DEEP NON-NEGATIVE MATRIX FACTORIZATION NMF operates on a matrix of F-dimensional non-neg...