Results 1 
1 of
1
Nonnegative matrix factorization under heavy noise.
 In Proceedings of the 33nd International Conference on Machine Learning,
, 2016
"... Abstract The Noisy Nonnegative Matrix factorization (NMF) is: given a data matrix A (d × n), find nonnegative matrices B, C (d × k, k × n respy.) so that A = BC + N , where N is a noise matrix. Existing polynomial time algorithms with proven error guarantees require each column N ·,j to have l 1 ..."
Abstract

Cited by 1 (0 self)
 Add to MetaCart
(Show Context)
Abstract The Noisy Nonnegative Matrix factorization (NMF) is: given a data matrix A (d × n), find nonnegative matrices B, C (d × k, k × n respy.) so that A = BC + N , where N is a noise matrix. Existing polynomial time algorithms with proven error guarantees require each column N ·,j to have l 1 norm much smaller than (BC) ·,j  1 , which could be very restrictive. In important applications of NMF such as Topic Modeling as well as theoretical noise models (eg. Gaussian with high σ), almost every column of N ·j violates this condition. We introduce the heavy noise model which only requires the average noise over large subsets of columns to be small. We initiate a study of Noisy NMF under the heavy noise model. We show that our noise model subsumes noise models of theoretical and practical interest (for eg. Gaussian noise of maximum possible σ). We then devise an algorithm TSVDNMF which under certain assumptions on B, C, solves the problem under heavy noise. Our error guarantees match those of previous algorithms. Our running time of O((n + d) 2 k) is substantially better than the O(n 3 d) for the previous best. Our assumption on B is weaker than the "Separability" assumption made by all previous results. We provide empirical justification for our assumptions on C. We also provide the first proof of identifiability (uniqueness of B) for noisy NMF which is not based on separability and does not use hard to check geometric conditions. Our algorithm outperforms earlier polynomial time algorithms both in time and error, particularly in the presence of high noise.