Results 1–10 of 104
A unified framework for high-dimensional analysis of M-estimators with decomposable regularizers
Restricted strong convexity and weighted matrix completion: Optimal bounds with noise
, 2012
"... We consider the matrix completion problem under a form of row/column weighted entrywise sampling, including the case of uniform entrywise sampling as a special case. We analyze the associated random observation operator, and prove that with high probability, it satisfies a form of restricted strong ..."
Abstract

Cited by 84 (10 self)
 Add to MetaCart
We consider the matrix completion problem under a form of row/column weighted entry-wise sampling, including uniform entry-wise sampling as a special case. We analyze the associated random observation operator and prove that, with high probability, it satisfies a form of restricted strong convexity with respect to a weighted Frobenius norm. Using this property, we obtain as corollaries a number of error bounds on matrix completion in the weighted Frobenius norm under noisy sampling, for both exactly and near low-rank matrices. Our results are based on measures of the “spikiness” and “low-rankness” of matrices that are less restrictive than the incoherence conditions imposed in previous work. Our technique involves an M-estimator that includes controls on both the rank and spikiness of the solution, and we establish non-asymptotic error bounds in the weighted Frobenius norm for recovering matrices lying in ℓq-balls of bounded spikiness. Using information-theoretic methods, we show that no algorithm can achieve better estimates (up to a logarithmic factor) over these same sets, showing that our conditions on matrices and the associated rates are essentially optimal.
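The estimator described here combines a nuclear norm penalty with an explicit entry-wise ("spikiness") bound. A minimal sketch, assuming a squared-error data term over the observed entries and a simple proximal-gradient heuristic; the soft-threshold-then-clip order, the parameter values, and the function names are illustrative assumptions, not the paper's exact algorithm:

```python
import numpy as np

def svt(M, tau):
    """Singular value soft-thresholding: prox of tau * nuclear norm."""
    U, s, Vt = np.linalg.svd(M, full_matrices=False)
    return U @ np.diag(np.maximum(s - tau, 0.0)) @ Vt

def spiky_completion(Y, mask, lam=0.1, alpha=1.0, step=1.0, iters=200):
    """Heuristic for  min 0.5*||mask*(X - Y)||_F^2 + lam*||X||_*
    subject to the spikiness bound ||X||_inf <= alpha.
    The entry-wise clip after the SVT step is a simplification."""
    X = np.zeros_like(Y)
    for _ in range(iters):
        grad = mask * (X - Y)           # gradient of the data-fit term
        X = svt(X - step * grad, step * lam)
        X = np.clip(X, -alpha, alpha)   # enforce the entry-wise bound
    return X
```

With a mildly spiky low-rank target and most entries observed, the iterates fit the observations while staying inside the entry-wise ball.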
A Simple Algorithm for Nuclear Norm Regularized Problems
"... Optimization problems with a nuclear norm regularization, such as e.g. low norm matrix factorizations, have seen many applications recently. We propose a new approximation algorithm building upon the recent sparse approximate SDP solver of (Hazan, 2008). The experimental efficiency of our method is ..."
Abstract

Cited by 49 (3 self)
 Add to MetaCart
(Show Context)
Optimization problems with nuclear norm regularization, such as low-norm matrix factorizations, have seen many applications recently. We propose a new approximation algorithm building upon the recent sparse approximate SDP solver of Hazan (2008). The experimental efficiency of our method is demonstrated on large matrix completion problems such as the Netflix dataset. The algorithm comes with strong convergence guarantees and can be interpreted as a first theoretically justified variant of Simon-Funk-type SVD heuristics. The method is free of tuning parameters and very easy to parallelize.
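Hazan's solver is essentially a Frank-Wolfe method over the nuclear norm ball: each iteration needs only the top singular vector pair of the gradient and adds one rank-one term. A minimal sketch for a squared-error completion objective, where a full SVD stands in for the cheap approximate top-pair computation such methods rely on; the radius `tau` and the step rule 2/(k+2) are standard assumptions, not details from the abstract:

```python
import numpy as np

def frank_wolfe_completion(Y, mask, tau, iters=200):
    """Frank-Wolfe for  min 0.5*||mask*(X - Y)||_F^2  s.t. ||X||_* <= tau.
    Every iterate is a convex combination of rank-one terms."""
    X = np.zeros_like(Y)
    for k in range(iters):
        G = mask * (X - Y)                      # gradient of the objective
        U, s, Vt = np.linalg.svd(G)             # only the top pair is needed
        S = -tau * np.outer(U[:, 0], Vt[0, :])  # linear minimization over the nuclear ball
        gamma = 2.0 / (k + 2.0)
        X = (1 - gamma) * X + gamma * S         # standard Frank-Wolfe step
    return X
```

Because each update mixes in a point of nuclear norm exactly `tau`, every iterate stays inside the ball by convexity, with no projection step.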
Collaborative filtering in a non-uniform world: Learning with the weighted trace norm
, 2010
"... We show that matrix completion with tracenorm regularization can be significantly hurt when entries of the matrix are sampled nonuniformly, but that a properly weighted version of the tracenorm regularizer works well with nonuniform sampling. We show that the weighted tracenorm regularization i ..."
Abstract

Cited by 41 (5 self)
 Add to MetaCart
(Show Context)
We show that matrix completion with trace-norm regularization can be significantly hurt when entries of the matrix are sampled non-uniformly, but that a properly weighted version of the trace-norm regularizer works well with non-uniform sampling. We show that the weighted trace-norm regularization indeed yields significant gains on the highly non-uniformly sampled Netflix dataset.
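The weighted trace norm referenced here is usually taken to be ∥diag(√p) X diag(√q)∥_*, with p and q the row and column sampling marginals. A small sketch of that quantity; the reduction to a scaled unweighted trace norm under uniform marginals is a sanity check, not a claim from the abstract:

```python
import numpy as np

def weighted_trace_norm(X, row_p, col_p):
    """Weighted trace norm || diag(sqrt(row_p)) X diag(sqrt(col_p)) ||_*,
    where row_p and col_p are the row/column sampling marginals.
    With uniform marginals this is the usual trace norm up to a 1/sqrt(n*m) factor."""
    W = np.sqrt(row_p)[:, None] * X * np.sqrt(col_p)[None, :]
    return np.linalg.svd(W, compute_uv=False).sum()
```

Non-uniform marginals up-weight frequently sampled rows and columns, which is the mechanism the abstract credits for the gains under non-uniform sampling.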
Matrix estimation by universal singular value thresholding
, 2012
"... Abstract. Consider the problem of estimating the entries of a large matrix, when the observed entries are noisy versions of a small random fraction of the original entries. This problem has received widespread attention in recent times, especially after the pioneering works of Emmanuel Candès and ..."
Abstract

Cited by 28 (0 self)
 Add to MetaCart
(Show Context)
Consider the problem of estimating the entries of a large matrix when the observed entries are noisy versions of a small random fraction of the original entries. This problem has received widespread attention in recent times, especially after the pioneering works of Emmanuel Candès and collaborators. This paper introduces a simple estimation procedure, called Universal Singular Value Thresholding (USVT), that works for any matrix that has ‘a little bit of structure’. Surprisingly, this simple estimator achieves the minimax error rate up to a constant factor. The method is applied to solve problems related to low-rank matrix estimation, blockmodels, distance matrix completion, latent space models, positive definite matrix completion, graphon estimation, and generalized Bradley–Terry models for pairwise comparison.
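A minimal numpy sketch of the USVT recipe as commonly described: zero-fill the unobserved entries, keep only the singular values above a universal level of order √(n·p̂), rescale by the sampling fraction, and clip back to the assumed entry range. The (2 + η) threshold constant follows the usual presentation; everything else is a simplification:

```python
import numpy as np

def usvt(Y, mask, eta=0.01):
    """Universal Singular Value Thresholding (sketch).
    Y: observed matrix with unobserved entries set to 0, entries assumed in [-1, 1];
    mask: 1 where observed, 0 elsewhere."""
    n = max(Y.shape)
    p_hat = mask.mean()                               # empirical sampling fraction
    U, s, Vt = np.linalg.svd(Y, full_matrices=False)
    thr = (2.0 + eta) * np.sqrt(n * p_hat)            # universal threshold
    keep = s >= thr
    W = (U[:, keep] * s[keep]) @ Vt[keep, :] / p_hat  # keep large singular values, rescale
    return np.clip(W, -1.0, 1.0)                      # entries were assumed bounded
```

Note there is no tuning beyond η: the threshold depends only on the dimensions and the observed fraction, which is what makes the estimator "universal".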
Scaled Gradients on Grassmann Manifolds for Matrix Completion
"... This paper describes gradient methods based on a scaled metric on the Grassmann manifold for lowrank matrix completion. The proposed methods significantly improve canonical gradient methods, especially on illconditioned matrices, while maintaining established global convegence and exact recovery g ..."
Abstract

Cited by 19 (0 self)
 Add to MetaCart
(Show Context)
This paper describes gradient methods based on a scaled metric on the Grassmann manifold for low-rank matrix completion. The proposed methods significantly improve canonical gradient methods, especially on ill-conditioned matrices, while maintaining established global convergence and exact recovery guarantees. A connection between a form of subspace iteration for matrix completion and the scaled gradient descent procedure is also established. The proposed conjugate gradient method based on the scaled gradient outperforms several existing algorithms for matrix completion and is competitive with recently proposed methods.
Low-Rank Optimization with Trace Norm Penalty
"... Abstract. The paper addresses the problem of lowrank trace norm minimization. We propose an algorithm that alternates between fixedrank optimization and rankone updates. The fixedrank optimization is characterized by an efficient factorization that makes the trace norm differentiable in the sear ..."
Abstract

Cited by 19 (5 self)
 Add to MetaCart
(Show Context)
The paper addresses the problem of low-rank trace norm minimization. We propose an algorithm that alternates between fixed-rank optimization and rank-one updates. The fixed-rank optimization is characterized by an efficient factorization that makes the trace norm differentiable in the search space and the computation of the duality gap numerically tractable. The search space is nonlinear but is equipped with a Riemannian structure that leads to efficient computations. We present a second-order trust-region algorithm with a guaranteed quadratic rate of convergence. Overall, the proposed optimization scheme converges superlinearly to the global solution while maintaining complexity that is linear in the number of rows and columns of the matrix. To compute a set of solutions efficiently for a grid of regularization parameters, we propose a predictor-corrector approach that outperforms the naive warm-restart approach on the fixed-rank quotient manifold. The performance of the proposed algorithm is illustrated on problems of low-rank matrix completion and multivariate linear regression.
Iterative reweighted algorithms for matrix rank minimization
 Journal of Machine Learning Research
"... The problem of minimizing the rank of a matrix subject to affine constraints has many applications in machine learning, and is known to be NPhard. One of the tractable relaxations proposed for this problem is nuclear norm (or trace norm) minimization of the matrix, which is guaranteed to find the m ..."
Abstract

Cited by 17 (0 self)
 Add to MetaCart
(Show Context)
The problem of minimizing the rank of a matrix subject to affine constraints has many applications in machine learning and is known to be NP-hard. One of the tractable relaxations proposed for this problem is nuclear norm (or trace norm) minimization of the matrix, which is guaranteed to find the minimum-rank matrix under suitable assumptions. In this paper, we propose a family of Iterative Reweighted Least Squares algorithms IRLS-p (with 0 ≤ p ≤ 1) as a computationally efficient way to improve over the performance of nuclear norm minimization. The algorithms can be viewed as (locally) minimizing certain smooth approximations to the rank function. When p = 1, we give theoretical guarantees similar to those for nuclear norm minimization, i.e., recovery of low-rank matrices under certain assumptions on the operator defining the constraints. For p < 1, IRLS-p shows better empirical performance in terms of recovering low-rank matrices than nuclear norm minimization. We provide an efficient implementation for IRLS-p, and also present a related family of algorithms, sIRLS-p. These algorithms exhibit competitive run times and improved recovery when compared to existing algorithms for random instances of the matrix completion problem, as well as on the MovieLens movie recommendation data set.
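A rough numpy sketch of an sIRLS-style iteration for matrix completion in the spirit described: a gradient step on the smoothed surrogate tr((XXᵀ + γI)^{p/2}), whose gradient is proportional to the reweighting matrix W = (XXᵀ + γI)^{p/2−1} applied to X, followed by projection back onto the observed entries. The step-size damping and the γ schedule here are ad hoc choices for illustration, not the paper's tuned rules:

```python
import numpy as np

def sirls_completion(Y, mask, p=1.0, gamma0=1.0, step=0.5, iters=150):
    """sIRLS-p-style sketch: descend a smooth surrogate of rank while
    keeping the observed entries (Y on mask) fixed."""
    X = mask * Y
    gamma = gamma0
    for _ in range(iters):
        # IRLS weight matrix W = (X X^T + gamma I)^(p/2 - 1) via eigendecomposition
        vals, vecs = np.linalg.eigh(X @ X.T + gamma * np.eye(X.shape[0]))
        W = (vecs * vals ** (p / 2.0 - 1.0)) @ vecs.T
        X = X - step * np.sqrt(gamma) * (W @ X)  # damped step keeps the update stable
        X = np.where(mask > 0, Y, X)             # project back onto the observations
        gamma = max(gamma / 1.05, 1e-10)         # slowly sharpen the smoothing
    return X
```

The √γ damping bounds each multiplicative update factor away from blow-up while still shrinking directions outside the current column space, which is what drives the iterates toward a low-rank matrix agreeing with the observations.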
Learning with the Weighted Trace-norm under Arbitrary Sampling Distributions
"... We provide rigorous guarantees on learning with the weighted tracenorm under arbitrary sampling distributions. We show that the standard weightedtrace norm might fail when the sampling distribution is not a product distribution (i.e. when row and column indexes are not selected independently), pre ..."
Abstract

Cited by 17 (4 self)
 Add to MetaCart
(Show Context)
We provide rigorous guarantees on learning with the weighted trace-norm under arbitrary sampling distributions. We show that the standard weighted trace-norm might fail when the sampling distribution is not a product distribution (i.e., when row and column indices are not selected independently), present a corrected variant for which we establish strong learning guarantees, and demonstrate that it works better in practice. We provide guarantees when weighting by either the true or the empirical sampling distribution, and suggest that even if the true distribution is known (or is uniform), weighting by the empirical distribution may be beneficial.