Results 1–10 of 10
Global convergence of stochastic gradient descent for some nonconvex matrix problems. arXiv preprint arXiv:1411.1134, 2014
Cited by 5 (0 self)
Abstract: Stochastic gradient descent (SGD) on a low-rank factorization ...
Low-rank Solutions of Linear Matrix Equations via Procrustes Flow, 2015
Cited by 4 (0 self)
Abstract: In this paper we study the problem of recovering a low-rank positive semidefinite matrix from linear measurements. Our algorithm, which we call Procrustes Flow, starts from an initial estimate obtained by a thresholding scheme followed by gradient descent on a nonconvex objective. We show that as long as the measurements obey a standard restricted isometry property, our algorithm converges to the unknown matrix at a geometric rate. In the case of Gaussian measurements, such convergence occurs for an n×n matrix of rank r when the number of measurements exceeds a constant times nr.
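The two-stage recipe in this abstract (an initialization step followed by gradient descent on a factored nonconvex objective) can be sketched compactly. The numpy sketch below is an illustration only: it substitutes a plain spectral initialization for the paper's thresholding scheme, and the step size and iteration count are ad hoc choices, not the paper's.

```python
import numpy as np

def procrustes_flow_sketch(A, y, r, steps=500, lr=0.2):
    """Recover a PSD low-rank M = U U^T from y_i = <A_i, M>.

    A : (m, n, n) symmetric sensing matrices; y : (m,) measurements.
    Stage 1: spectral initialization from (1/m) * sum_i y_i A_i.
    Stage 2: gradient descent on f(U) = (1/4m) * sum_i (<A_i, U U^T> - y_i)^2.
    """
    m, n, _ = A.shape
    S = np.tensordot(y, A, axes=1) / m          # unbiased estimate of M
    w, V = np.linalg.eigh(S)
    top = np.argsort(w)[::-1][:r]
    U = V[:, top] * np.sqrt(np.maximum(w[top], 0.0))
    eta = lr / max(np.linalg.norm(U, 2) ** 2, 1e-12)
    for _ in range(steps):
        resid = np.einsum('kij,ij->k', A, U @ U.T) - y
        grad = np.tensordot(resid, A, axes=1) @ U / m   # gradient in U
        U = U - eta * grad
    return U @ U.T
```

With Gaussian measurements and m a modest multiple of nr, the iterates contract geometrically toward the ground truth, matching the abstract's claim.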
Solving Random Quadratic Systems of Equations is nearly as easy as …, 2015
Cited by 2 (1 self)
Abstract: We consider the fundamental problem of solving quadratic systems of equations in n variables, where y_i = 〈a_i, x〉², i = 1, …, m, and x ∈ ℝⁿ is unknown. We propose a novel method which, starting with an initial guess computed by means of a spectral method, proceeds by minimizing a nonconvex functional as in the Wirtinger flow approach [11]. There are several key distinguishing features, most notably a distinct objective functional and novel update rules, which operate in an adaptive fashion and drop terms bearing too much influence on the search direction. These careful selection rules provide a tighter initial guess, better descent directions, and thus enhanced practical performance. On the theoretical side, we prove that for certain unstructured models of quadratic systems, our algorithms return the correct solution in linear time, i.e. in time proportional to reading the data {a_i} and {y_i}, as soon as the ratio m/n between the number of equations and unknowns exceeds a fixed numerical constant. We extend the theory to deal with noisy systems in which we only have y_i ≈ 〈a_i, x〉² and prove that our algorithms achieve a statistical accuracy which is nearly unimprovable. We complement our theoretical study with numerical examples showing that solving random quadratic systems is both computationally and statistically not much harder than solving linear systems of the same size, hence the title of this paper. For instance, we ...
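The spectral-initialization-plus-trimmed-descent idea can be illustrated in a few lines. This is a simplified real-valued sketch: the trimming rule here (drop a fixed fraction of the largest-residual terms each step) is only a stand-in for the paper's adaptive selection rules, and the least-squares surrogate need not match the paper's exact objective.

```python
import numpy as np

def truncated_flow_sketch(Amat, y, steps=500, lr=0.1, keep=0.9):
    """Solve y_i = <a_i, x>^2 by a spectral start plus trimmed gradient steps.

    Amat : (m, n) rows a_i; y : (m,) quadratic measurements.
    Each step drops the largest-residual 10% of terms before averaging
    the gradient, so no single equation dominates the search direction.
    """
    m, n = Amat.shape
    Y = (Amat * y[:, None]).T @ Amat / m        # (1/m) sum y_i a_i a_i^T
    w, V = np.linalg.eigh(Y)
    z = V[:, -1] * np.sqrt(max(w[-1] / 3.0, 0.0))   # E[top eig] = 3 ||x||^2
    k = int(keep * m)
    for _ in range(steps):
        p = Amat @ z
        r = p ** 2 - y
        kept = np.argsort(np.abs(r))[:k]        # trim the heaviest residuals
        g = (Amat[kept] * (r[kept] * p[kept])[:, None]).mean(axis=0)
        z = z - lr * g / max(z @ z, 1e-12)
    return z
```

In the noiseless Gaussian model the iterates converge to ±x once m/n exceeds a constant, mirroring the linear-time claim in the abstract.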
Provable Efficient Online Matrix Completion via Nonconvex Stochastic Gradient Descent
Abstract: Matrix completion, where we wish to recover a low-rank matrix by observing a few of its entries, is a widely studied problem in both theory and practice, with wide applications. Most of the provable algorithms for this problem so far have been restricted to the offline setting, where they provide an estimate of the unknown matrix using all observations simultaneously. However, in many applications the online version, where we observe one entry at a time and dynamically update our estimate, is more appealing. While existing algorithms are efficient for the offline setting, they could be highly inefficient for the online setting. In this paper, we propose the first provable, efficient online algorithm for matrix completion. Our algorithm starts from an initial estimate of the matrix and then performs nonconvex stochastic gradient descent (SGD). After every observation, it performs a fast update involving only one row of each of two tall matrices, giving near-linear total runtime. Our algorithm can be used naturally in the offline setting as well, where it achieves sample complexity and runtime competitive with state-of-the-art algorithms. Our proofs introduce a general framework showing that SGD updates tend to stay away from saddle surfaces, which could be of broader interest for other nonconvex problems.
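The "one row of each of two tall matrices" update is tiny. A hedged sketch (the step size is illustrative, and the paper's algorithm additionally relies on a good initial estimate):

```python
import numpy as np

def online_mc_step(U, V, i, j, mij, eta=0.05):
    """One SGD step for matrix completion upon observing M[i, j] = mij.

    Model M ~ U V^T with U (n1 x r), V (n2 x r). Only row i of U and
    row j of V are touched, so each update costs O(r) time.
    """
    err = U[i] @ V[j] - mij
    new_ui = U[i] - eta * err * V[j]
    V[j] = V[j] - eta * err * U[i]
    U[i] = new_ui
    return U, V
```

Streaming random entries through this update from a warm start drives the reconstruction error of the full matrix toward zero, which is the sense in which the offline problem is also solved in near-linear total time.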
Dynamic matrix recovery from incomplete observations under an exact low-rank constraint
Abstract: Low-rank matrix factorizations arise in a wide variety of applications including recommendation systems, topic models, and source separation, to name just a few. In these and many other applications, it has been widely noted that by incorporating temporal information and allowing for the possibility of time-varying models, significant improvements are possible in practice. However, despite the reported superior empirical performance of these dynamic models over their static counterparts, there is limited theoretical justification for introducing these more complex models. In this paper we aim to address this gap by studying the problem of recovering a dynamically evolving low-rank matrix from incomplete observations. First, we propose the locally weighted matrix smoothing (LOWEMS) framework as one possible approach to dynamic matrix recovery. We then establish error bounds for LOWEMS in both the matrix sensing and matrix completion observation models. Our results quantify the potential benefits of exploiting dynamic constraints both in terms of recovery accuracy and sample complexity. To illustrate these benefits we provide both synthetic and real-world experimental results.
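One way to read "locally weighted matrix smoothing" is as a weighted least-squares fit that pools observations from nearby time frames, down-weighting frames far from the one being estimated. The sketch below is a hypothetical instantiation of that idea with geometrically decaying weights θ^|t−s| and a warm-started gradient descent; the paper's actual LOWEMS estimator and weighting scheme may differ.

```python
import numpy as np

def lowems_sketch(obs, t, U0, V0, theta=0.5, steps=1000, lr=0.01):
    """Estimate the low-rank matrix at frame t from multi-frame observations.

    obs : list over frames s of (rows, cols, vals) observed entries.
    Minimizes sum_s theta**|t-s| * sum_{(i,j) in frame s} (U_i . V_j - val)^2
    over factors (U, V) by gradient descent from a warm start (U0, V0).
    """
    U, V = U0.copy(), V0.copy()
    frames = [(theta ** abs(t - s), np.asarray(I), np.asarray(J), np.asarray(v))
              for s, (I, J, v) in enumerate(obs)]
    for _ in range(steps):
        gU, gV = np.zeros_like(U), np.zeros_like(V)
        for w, I, J, v in frames:
            err = np.einsum('ik,ik->i', U[I], V[J]) - v
            np.add.at(gU, I, (w * err)[:, None] * V[J])   # unbuffered scatter-add
            np.add.at(gV, J, (w * err)[:, None] * U[I])
        U -= lr * gU
        V -= lr * gV
    return U @ V.T
```

When the underlying matrix changes slowly across frames, pooling in this way effectively multiplies the sample budget, which is the sample-complexity benefit the abstract quantifies.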
Recovery guarantee of weighted low-rank approximation via alternating minimization
Abstract: Many applications require recovering a ground-truth low-rank matrix from noisy observations of its entries, which in practice is typically formulated as a weighted low-rank approximation problem and solved by nonconvex optimization heuristics such as alternating minimization. In this paper, we provide a provable recovery guarantee for weighted low-rank approximation via a simple alternating minimization algorithm. In particular, for a natural class of matrices and weights, and without any assumption on the noise, we bound the spectral norm of the difference between the recovered matrix and the ground truth by the spectral norm of the weighted noise plus an additive error term that decreases exponentially with the number of rounds of alternating minimization, from either initialization by SVD or, more importantly, random initialization. These are the first theoretical results for weighted low-rank approximation via alternating minimization with non-binary deterministic weights, significantly generalizing those for matrix completion, the special case with binary weights, since our assumptions are similar to or weaker than those made in existing works. Furthermore, this is achieved by a very simple algorithm that improves the vanilla alternating minimization with a simple clipping step.
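A bare-bones version of the algorithm: each half-step solves a small weighted least-squares problem per row of one factor, with an optional clipping of factor entries. The clipping threshold and the plain SVD initialization below are hypothetical choices for illustration, not the paper's exact scheme.

```python
import numpy as np

def weighted_altmin(M, W, r, rounds=50, clip=None):
    """Weighted low-rank approximation: min sum_ij W_ij (M_ij - (U V^T)_ij)^2.

    Alternating minimization: each row update is an r x r weighted
    least-squares solve. `clip` optionally caps factor entries, a simple
    stand-in for the paper's clipping step.
    """
    n1, n2 = M.shape
    U, _, _ = np.linalg.svd(W * M, full_matrices=False)
    U = U[:, :r]
    V = np.zeros((n2, r))
    for _ in range(rounds):
        for j in range(n2):                        # update rows of V
            G = (U * W[:, j][:, None]).T @ U + 1e-9 * np.eye(r)
            V[j] = np.linalg.solve(G, (U * W[:, j][:, None]).T @ M[:, j])
        if clip is not None:
            V = np.clip(V, -clip, clip)
        for i in range(n1):                        # update rows of U
            G = (V * W[i][:, None]).T @ V + 1e-9 * np.eye(r)
            U[i] = np.linalg.solve(G, (V * W[i][:, None]).T @ M[i])
        if clip is not None:
            U = np.clip(U, -clip, clip)
    return U @ V.T
```

Setting W to a 0/1 mask recovers the familiar alternating-minimization algorithm for matrix completion, the special case the abstract mentions.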
Fast Algorithms for Robust PCA via Gradient Descent
Abstract: We consider the problem of Robust PCA in the fully and partially observed settings. Without corruptions, this is the well-known matrix completion problem. From a statistical standpoint this problem has recently been well studied, and conditions on when recovery is possible (how many observations do we need, how many corruptions can we tolerate) via polynomial-time algorithms are by now understood. This paper presents and analyzes a nonconvex optimization approach that greatly reduces the computational complexity of the above problems compared to the best available algorithms. In particular, in the fully observed case, with r denoting rank and d dimension, we reduce the complexity from O(r²d² log(1/ε)) to O(rd² log(1/ε)), a big savings when the rank is big. For the partially observed case, we show the complexity of our algorithm is no more than O(r⁴d log d log(1/ε)). Not only is this the best-known runtime for a provable algorithm under partial observation, but in the setting where r is small compared to d, it also allows for near-linear-in-d runtime that can be exploited in the fully observed case as well, by simply running our algorithm on a subset of the observations.
A Geometric Analysis of Phase Retrieval
Abstract: Can we recover a complex signal from its Fourier magnitudes? More generally, given a set of m measurements, y_k = |a_k* x| for k = 1, …, m, is it possible to recover x ∈ ℂⁿ (i.e., a length-n complex vector)? This generalized phase retrieval (GPR) problem is a fundamental task in various disciplines, and has been the subject of much recent investigation. Natural nonconvex heuristics often work remarkably well for GPR in practice, but lack clear theoretical explanations. In this paper, we take a step towards bridging this gap. We prove that when the measurement vectors a_k are generic (i.i.d. complex Gaussian) and the number of measurements is large enough (m ≥ Cn log³ n), with high probability, a natural least-squares formulation for GPR has the following benign geometric structure: (1) there are no spurious local minimizers, and all global minimizers are equal to the target signal x, up to a global phase; and (2) the objective function has negative curvature around each saddle point. This structure allows a number of iterative optimization methods to efficiently find a global minimizer without special initialization. To corroborate the claim, we describe and analyze a second-order trust-region algorithm.
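The benign-landscape message, that no special initialization is needed, can be probed numerically. The toy below is real-valued and uses plain gradient descent from random starts (the paper works over ℂ and analyzes a second-order trust-region method); success from arbitrary starts is exactly what the no-spurious-minimizers result predicts.

```python
import numpy as np

def gpr_descent(Amat, y2, steps=3000, lr=0.05, seed=0):
    """Minimize f(z) = (1/4m) sum_k (y_k^2 - <a_k, z>^2)^2 from a random start.

    Amat : (m, n) rows a_k; y2 : (m,) squared magnitudes y_k^2.
    Real-valued stand-in for the complex Gaussian model in the paper.
    """
    m, n = Amat.shape
    rng = np.random.default_rng(seed)
    z = rng.standard_normal(n)
    z *= np.sqrt(y2.mean()) / np.linalg.norm(z)   # match the energy of x
    for _ in range(steps):
        p = Amat @ z
        g = (Amat * (((p ** 2) - y2) * p)[:, None]).mean(axis=0)
        z -= lr * g / max(z @ z, 1e-12)
    return z
```

In the real model the global sign of x is unrecoverable, the analogue of the global phase ambiguity in the abstract, so success means landing on ±x.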
Efficient matrix completion for seismic data, 2015
Abstract: Despite recent developments in improved acquisition, seismic data often remain undersampled along source and receiver coordinates, resulting in incomplete data for key applications such as migration and multiple prediction. We interpret the missing-trace interpolation problem in the context of matrix completion and outline three practical principles for using low-rank optimization techniques to recover seismic data. Specifically, we strive for recovery scenarios wherein the original signal is low rank and the subsampling scheme increases the singular values of the matrix. We employ an optimization program that restores this low-rank structure to recover the full volume. Omitting one or more of these principles can lead to poor interpolation results, as we show experimentally. In light of this theory, we compensate for the high-rank behavior of data in the source-receiver domain by employing the midpoint-offset transformation for 2D data and a source-receiver permutation for 3D data to reduce the overall singular values. Simultaneously, in order to work with computationally feasible algorithms for large-scale data, we use a factorization-based approach to matrix completion, which significantly speeds up the computations compared to repeated singular value decompositions without reducing the recovery quality.
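The midpoint-offset transformation mentioned for 2D data is a simple coordinate change on the source-receiver matrix. The indexing convention below (row index = source + receiver, i.e. twice the midpoint; column index = shifted offset) is one common choice for illustration, not necessarily the authors' exact one.

```python
import numpy as np

def sr_to_midpoint_offset(D):
    """Re-index a source-receiver data matrix into midpoint-offset coordinates.

    D[s, r] is placed at row (s + r) and column (r - s + n - 1); cells with
    no source-receiver pair are NaN. The map is a bijection on the filled
    cells, so it can be inverted exactly.
    """
    n = D.shape[0]
    out = np.full((2 * n - 1, 2 * n - 1), np.nan)
    s, r = np.meshgrid(np.arange(n), np.arange(n), indexing='ij')
    out[s + r, r - s + n - 1] = D
    return out

def midpoint_offset_to_sr(M):
    """Invert sr_to_midpoint_offset."""
    n = (M.shape[0] + 1) // 2
    s, r = np.meshgrid(np.arange(n), np.arange(n), indexing='ij')
    return M[s + r, r - s + n - 1]
```

Because the transform is lossless, one can complete the matrix in the domain where its singular values decay fastest and then map back, which is the workflow the abstract describes.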
Nonconvex Low Rank Matrix Factorization via Inexact First Order Oracle
Abstract: We study the low-rank matrix factorization problem via nonconvex optimization. Compared with the convex relaxation approach, nonconvex optimization exhibits superior empirical performance for large-scale low-rank matrix estimation. However, the understanding of its theoretical guarantees is limited. To bridge this gap, we exploit the notion of an inexact first-order oracle, which naturally appears in low-rank matrix factorization problems such as matrix sensing and completion. In particular, our analysis shows that a broad class of nonconvex optimization algorithms, including alternating minimization and gradient-type methods, can be treated as solving two sequences of convex optimization problems using an inexact first-order oracle. We can thus show that these algorithms converge geometrically to the global optima and recover the true low-rank matrices under suitable conditions. Numerical results are provided to support our theory.
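The "two sequences of convex problems" view is concrete in matrix sensing: with one factor frozen, the fit is an ordinary least-squares problem in the other. A hedged sketch of alternating least squares with a spectral initialization (dimensions, measurement counts, and round counts below are illustrative choices, not the paper's):

```python
import numpy as np

def altmin_sensing(A, y, n1, n2, r, rounds=30):
    """Matrix sensing y_i = <A_i, X*> by alternating least squares, X = U V^T.

    With V fixed, the fit is linear (hence convex) in vec(U), and vice
    versa -- the two convex subproblem sequences the analysis studies.
    """
    m = A.shape[0]
    S = np.tensordot(y, A, axes=1) / m          # spectral initialization
    _, _, Vt = np.linalg.svd(S)
    V = Vt[:r].T                                # (n2, r)
    for _ in range(rounds):
        # min_U sum_i (<A_i, U V^T> - y_i)^2 is least squares in vec(U).
        B = np.einsum('kij,jr->kir', A, V).reshape(m, n1 * r)
        U = np.linalg.lstsq(B, y, rcond=None)[0].reshape(n1, r)
        C = np.einsum('kij,ir->kjr', A, U).reshape(m, n2 * r)
        V = np.linalg.lstsq(C, y, rcond=None)[0].reshape(n2, r)
    return U @ V.T
```

The inexactness in the oracle view comes from solving for U against the current V rather than the optimal one; as V improves, the oracle error shrinks, giving the geometric convergence the abstract claims.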