Phase Retrieval via Wirtinger Flow: Theory and Algorithms
, 2014
Abstract

Cited by 24 (4 self)
We study the problem of recovering the phase from magnitude measurements; specifically, we wish to reconstruct a complex-valued signal x ∈ C^n about which we have phaseless samples of the form y_r = |⟨a_r, x⟩|^2, r = 1, …, m (knowledge of the phase of these samples would yield a linear system). This paper develops a nonconvex formulation of the phase retrieval problem as well as a concrete solution algorithm. In a nutshell, this algorithm starts with a careful initialization obtained by means of a spectral method, and then refines this initial estimate by iteratively applying novel update rules, which have low computational complexity, much like in a gradient descent scheme. The main contribution is that this algorithm is shown to rigorously allow the exact retrieval of phase information from a nearly minimal number of random measurements. Indeed, the sequence of successive iterates provably converges to the solution at a geometric rate, so that the proposed scheme is efficient in terms of both computational and data resources. In theory, a variation on this scheme leads to a near-linear-time algorithm for a physically realizable model based on coded diffraction patterns. We illustrate the effectiveness of our methods with various experiments on image data. Underlying our analysis are insights for the analysis of nonconvex optimization schemes that may have implications for computational problems beyond phase retrieval.
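The two-stage recipe in this abstract (a spectral initialization, then low-cost gradient-style updates) can be sketched in a few lines. The sketch below works the real-valued case with Gaussian measurements; the problem sizes, step size, and iteration count are illustrative choices, not the paper's tuned parameters.

```python
import numpy as np

rng = np.random.default_rng(0)
n, m = 20, 400                       # signal dimension, number of measurements
x = rng.standard_normal(n)           # ground-truth signal (real case for simplicity)
A = rng.standard_normal((m, n))      # sensing vectors a_r as rows
y = (A @ x) ** 2                     # phaseless samples y_r = |<a_r, x>|^2

# Spectral initialization: leading eigenvector of (1/m) sum_r y_r a_r a_r^T,
# rescaled so that ||z||^2 matches the estimated signal energy mean(y).
Y = (A * y[:, None]).T @ A / m
z = np.linalg.eigh(Y)[1][:, -1] * np.sqrt(y.mean())

# Gradient-style refinement of f(z) = (1/4m) sum_r ((<a_r, z>)^2 - y_r)^2.
step = 0.1 / y.mean()                # heuristic step size, roughly mu / ||x||^2
for _ in range(500):
    Az = A @ z
    z -= step * (A.T @ ((Az ** 2 - y) * Az) / m)

# The global sign of x is unidentifiable from magnitudes, so compare up to sign.
err = min(np.linalg.norm(z - x), np.linalg.norm(z + x)) / np.linalg.norm(x)
```

With m/n = 20 the iterates contract geometrically toward ±x, matching the convergence behavior the abstract describes.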
Phase retrieval using alternating minimization
 In NIPS
, 2013
Abstract

Cited by 24 (1 self)
Phase retrieval problems involve solving linear equations, but with missing sign (or phase, for complex numbers) information. Over the last two decades, a popular generic empirical approach to the many variants of this problem has been one of alternating minimization; i.e., alternating between estimating the missing phase information and the candidate solution. In this paper, we show that a simple alternating minimization algorithm geometrically converges to the solution of one such problem – finding a vector x from y and A, where y = |A^T x| and |z| denotes the vector of elementwise magnitudes of z – under the assumption that A is Gaussian. Empirically, our algorithm performs similarly to recently proposed convex techniques for this variant (which are based on “lifting” to a convex matrix problem) in sample complexity and robustness to noise. However, our algorithm is much more efficient and can scale to large problems. Analytically, we show geometric convergence to the solution, and a sample complexity that is within log factors of obvious lower bounds. We also establish close-to-optimal scaling for the case when the unknown vector is sparse. Our work represents the only known theoretical guarantee for alternating minimization for any variant of phase retrieval problems in the nonconvex setting.
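A minimal real-valued version of the alternating scheme might look like the following (with measurement vectors stored as rows of A, so y = |Ax| rather than |A^T x|; the spectral-style start and iteration count are illustrative choices):

```python
import numpy as np

rng = np.random.default_rng(1)
n, m = 20, 200
x = rng.standard_normal(n)
A = rng.standard_normal((m, n))      # Gaussian measurement vectors as rows
y = np.abs(A @ x)                    # magnitude-only (sign-less) measurements

# Spectral-style initialization, then alternate between the two subproblems:
# estimate the missing signs, then least-squares for the candidate solution.
Y = (A * (y ** 2)[:, None]).T @ A / m
z = np.linalg.eigh(Y)[1][:, -1]
for _ in range(50):
    c = np.sign(A @ z)                               # phase (sign) estimate
    z = np.linalg.lstsq(A, c * y, rcond=None)[0]     # candidate-solution update

err = min(np.linalg.norm(z - x), np.linalg.norm(z + x)) / np.linalg.norm(x)
```

Once all signs are estimated correctly, the least-squares step recovers x exactly (up to global sign), which is the source of the geometric convergence the abstract highlights.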
New Algorithms for Learning Incoherent and Overcomplete Dictionaries
, 2014
Abstract

Cited by 19 (2 self)
In sparse recovery we are given a matrix A ∈ R^{n×m} (“the dictionary”) and a vector of the form AX where X is sparse, and the goal is to recover X. This is a central notion in signal processing, statistics and machine learning. But in applications such as sparse coding, edge detection, compression and super-resolution, the dictionary A is unknown and has to be learned from random examples of the form Y = AX where X is drawn from an appropriate distribution — this is the dictionary learning problem. In most settings, A is overcomplete: it has more columns than rows. This paper presents a polynomial-time algorithm for learning overcomplete dictionaries; the only previously known algorithm with provable guarantees is the recent work of Spielman et al. (2012), who gave an algorithm for the undercomplete case, which is rarely the case in applications. Our algorithm applies to incoherent dictionaries, which have been a central object of study since they were introduced in seminal work of Donoho and Huo (1999). In particular, a dictionary is µ-incoherent if each pair of columns has inner product at most µ/√n. The algorithm makes natural stochastic assumptions about the unknown sparse vector X, which can contain k ≤ c·min(√n/(µ log n), m^{1/2−η}) nonzero entries (for any η > 0). This is close to the best k allowable by the best sparse recovery algorithms even if one knows the dictionary A exactly. Moreover, both the running time and sample complexity depend on log(1/ε), where ε is the target accuracy, and so our algorithms converge very quickly to the true dictionary. Our algorithm can also tolerate substantial amounts of noise provided it is incoherent with respect to the dictionary (e.g., Gaussian). In the noisy setting, our running time and sample complexity depend polynomially on 1/ε, and this is necessary.
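The sparse-recovery primitive in the first sentence (recover X from AX when A is known and incoherent) can be sketched with orthogonal matching pursuit. OMP is a standard solver for this known-dictionary subproblem, not the dictionary-learning algorithm of the paper; the sizes and coefficient model below are illustrative.

```python
import numpy as np

rng = np.random.default_rng(2)
n, m, k = 100, 200, 3               # A is overcomplete: more columns than rows
A = rng.standard_normal((n, m))
A /= np.linalg.norm(A, axis=0)      # unit-norm columns, mutually incoherent w.h.p.
support = rng.choice(m, size=k, replace=False)
x_true = np.zeros(m)
x_true[support] = rng.choice([-1.0, 1.0], size=k) * (1.0 + rng.random(k))
y = A @ x_true

# Orthogonal matching pursuit: greedily add the column most correlated with
# the residual, then re-fit the coefficients on the selected support.
S, resid = [], y.copy()
for _ in range(k):
    S.append(int(np.argmax(np.abs(A.T @ resid))))
    coef = np.linalg.lstsq(A[:, S], y, rcond=None)[0]
    resid = y - A[:, S] @ coef

x_hat = np.zeros(m)
x_hat[S] = coef
err = np.linalg.norm(x_hat - x_true) / np.linalg.norm(x_true)
```

Because the random dictionary is incoherent and k is small, the greedy steps select the true support and the final least-squares fit is exact.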
Statistical guarantees for the EM algorithm: From population to sample-based analysis
, 2014
Abstract

Cited by 8 (0 self)
We develop a general framework for proving rigorous guarantees on the performance of the EM algorithm and a variant known as gradient EM. Our analysis is divided into two parts: a treatment of these algorithms at the population level (in the limit of infinite data), followed by results that apply to updates based on a finite set of samples. First, we characterize the domain of attraction of any global maximizer of the population likelihood. This characterization is based on a novel view of the EM updates as a perturbed form of likelihood ascent, or in parallel, of the gradient EM updates as a perturbed form of standard gradient ascent. Leveraging this characterization, we then provide nonasymptotic guarantees on the EM and gradient EM algorithms when applied to a finite set of samples. We develop consequences of our general theory for three canonical examples of incomplete-data problems: mixture of Gaussians, mixture of regressions, and linear regression with covariates missing completely at random. In each case, our theory guarantees that with a suitable initialization, a relatively small number of EM (or gradient EM) steps will yield (with high probability) an estimate that is within statistical error of the MLE. We provide simulations to confirm this theoretically predicted behavior.
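For one of the canonical examples above, a symmetric two-component Gaussian mixture in one dimension with means ±θ and known variance, the EM update has a closed form. The sketch below (sample size, initialization, and iteration count are illustrative) shows the iterates settling within statistical error of the truth when started inside the basin of attraction.

```python
import numpy as np

rng = np.random.default_rng(3)
theta_true, sigma, n = 2.0, 1.0, 5000
labels = rng.integers(0, 2, n)
X = np.where(labels == 1, theta_true, -theta_true) + sigma * rng.standard_normal(n)

def em_step(theta, X, sigma):
    # E-step: posterior probability that each sample came from the +theta component
    w = 1.0 / (1.0 + np.exp(-2.0 * theta * X / sigma ** 2))
    # M-step: maximize the expected complete-data log-likelihood over theta
    return np.mean((2.0 * w - 1.0) * X)

theta = 0.5                          # initialization inside the basin of attraction
for _ in range(50):
    theta = em_step(theta, X, sigma)
```

Each update is a weighted mean, a perturbed form of likelihood ascent in the sense of the abstract, and the fixed point lies within O(σ/√n) of θ = 2.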
Near-optimal compressed sensing of sparse rank-one matrices via sparse power factorization. arXiv preprint arXiv:1312.0525
, 2013
Matrix completion and low-rank SVD via fast alternating least squares. arXiv preprint arXiv:1410.2596
, 2014
Abstract

Cited by 6 (2 self)
The matrix-completion problem has attracted a lot of attention, largely as a result of the celebrated Netflix competition. Two popular approaches for solving the problem are nuclear-norm-regularized matrix approximation …
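In stripped-down form, the alternating-least-squares idea solves a small ridge-regularized least-squares problem per row and per column of the factorization. The sketch below is plain ALS on the observed entries (not the paper's fast variant), with illustrative sizes and ridge parameter.

```python
import numpy as np

rng = np.random.default_rng(4)
n, p, r = 40, 30, 2
M = rng.standard_normal((n, r)) @ rng.standard_normal((r, p))   # rank-r target
mask = rng.random((n, p)) < 0.5                                 # observed entries

U = rng.standard_normal((n, r))
V = rng.standard_normal((p, r))
lam = 1e-6                          # tiny ridge term for numerical stability
for _ in range(50):
    # Fix V and solve a small ridge regression per row of U, then symmetrically.
    for i in range(n):
        o = mask[i]
        U[i] = np.linalg.solve(V[o].T @ V[o] + lam * np.eye(r), V[o].T @ M[i, o])
    for j in range(p):
        o = mask[:, j]
        V[j] = np.linalg.solve(U[o].T @ U[o] + lam * np.eye(r), U[o].T @ M[o, j])

rel = np.linalg.norm(U @ V.T - M) / np.linalg.norm(M)
```

Each inner solve is an r×r system, so the per-iteration cost scales with the number of observed entries rather than the full matrix size.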
Guaranteed Non-Orthogonal Tensor Decomposition via Alternating Rank-1 Updates. arXiv preprint arXiv:1402.5180
, 2014
Abstract

Cited by 6 (3 self)
A simple alternating rank-1 update procedure is considered for CP tensor decomposition. Local convergence guarantees are established for third-order tensors of rank k in d dimensions, when k = o(d^{1.5}) and the tensor components are incoherent. We strengthen the results to global convergence guarantees when k = O(d) through a simple initialization procedure based on rank-1 singular value decomposition of random tensor slices. Our tight perturbation analysis leads to efficient sample guarantees for unsupervised learning of discrete multi-view mixtures when k = O(d), where k is the number of mixture components and d is the observed dimension. For learning overcomplete decompositions (k = ω(d)), we prove that having an extremely small number of labeled samples, scaling as polylog(k) for each label, under the semi-supervised setting (where the label corresponds to the choice variable in the mixture model) leads to global convergence guarantees for learning mixture models.
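For the rank-1 case, the alternating update reduces to a tensor power iteration: fix two components, contract the tensor against them, and normalize. The sketch below runs it on an exactly rank-1 third-order tensor, the noiseless special case rather than the paper's full rank-k analysis; dimensions and weight are illustrative.

```python
import numpy as np

rng = np.random.default_rng(5)
d = 15
a, b, c = (rng.standard_normal(d) for _ in range(3))
a, b, c = a / np.linalg.norm(a), b / np.linalg.norm(b), c / np.linalg.norm(c)
T = 3.0 * np.einsum('i,j,k->ijk', a, b, c)      # rank-1 tensor with weight 3

# Alternating rank-1 updates: contract T against two of the current vectors,
# normalize the result, and cycle through the three modes.
u, v, w = (rng.standard_normal(d) for _ in range(3))
for _ in range(30):
    u = np.einsum('ijk,j,k->i', T, v, w); u /= np.linalg.norm(u)
    v = np.einsum('ijk,i,k->j', T, u, w); v /= np.linalg.norm(v)
    w = np.einsum('ijk,i,j->k', T, u, v); w /= np.linalg.norm(w)

lam = np.einsum('ijk,i,j,k->', T, u, v, w)       # recovered component weight
```

On an exact rank-1 tensor each contraction is already proportional to the true component, so the iteration locks on immediately; the rank-k and noisy settings are where the paper's incoherence and initialization conditions matter.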
Global convergence of stochastic gradient descent for some nonconvex matrix problems. arXiv preprint arXiv:1411.1134
, 2014
Abstract

Cited by 5 (0 self)
Stochastic gradient descent (SGD) on a low-rank factorization …
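A minimal instance of SGD on a low-rank factorization is matrix completion with entrywise stochastic updates of the factors U and V. The sizes, initialization scale, step size, and epoch count below are illustrative assumptions, not from the paper.

```python
import numpy as np

rng = np.random.default_rng(6)
n, p, r = 30, 30, 2
M = rng.standard_normal((n, r)) @ rng.standard_normal((r, p))   # rank-r target
obs = [(i, j) for i in range(n) for j in range(p) if rng.random() < 0.6]

U = 0.1 * rng.standard_normal((n, r))    # small random factors
V = 0.1 * rng.standard_normal((p, r))
eta = 0.02                               # illustrative step size
for _ in range(500):
    for t in rng.permutation(len(obs)):  # one shuffled SGD pass over the entries
        i, j = obs[t]
        e = U[i] @ V[j] - M[i, j]        # residual on a single observed entry
        gU = e * V[j]                    # gradient w.r.t. U[i] at the old V[j]
        V[j] -= eta * e * U[i]
        U[i] -= eta * gU

rel = np.linalg.norm(U @ V.T - M) / np.linalg.norm(M)
```

Each update touches only one row of U and one row of V, which is what makes this style of SGD attractive for very large matrix problems.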
Exponential Family Matrix Completion under Structural Constraints
Abstract

Cited by 5 (0 self)
We consider the matrix completion problem of recovering a structured matrix from noisy and partial measurements. Recent works have proposed tractable estimators with strong statistical guarantees for the case where the underlying matrix is low-rank, and the measurements consist of a subset, either of the exact individual entries, or of the entries perturbed by additive Gaussian noise, which is thus implicitly suited for thin-tailed continuous data. Arguably, common applications of matrix completion require estimators for (a) heterogeneous data types, such as skewed-continuous, count, binary, etc., (b) heterogeneous noise models (beyond Gaussian), which capture varied uncertainty in the measurements, and (c) heterogeneous structural constraints beyond low rank, such as block-sparsity, or a superposition structure of low rank plus elementwise sparseness, among others. In this paper, we provide a broadly unified framework for generalized matrix completion by considering a setting wherein the matrix entries are sampled from any member of the rich family of exponential family distributions, and by imposing general structural constraints on the underlying matrix, as captured by a general regularizer R(·). We propose a simple convex regularized M-estimator for the generalized framework, and provide a unified and novel statistical analysis for this general class of estimators. Finally, we corroborate our theoretical results on simulated datasets.
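As one concrete instance of this kind of framework, the sketch below fits binary (Bernoulli) entries with a nuclear-norm regularizer R(·) = ‖·‖_* using proximal gradient descent. This is a generic first-order solver for such a convex regularized M-estimator, with illustrative sizes, step size, and λ; it is not the specific estimator or analysis of the paper.

```python
import numpy as np

rng = np.random.default_rng(7)
n, p, r = 30, 30, 2
L = 0.8 * rng.standard_normal((n, r)) @ rng.standard_normal((r, p))  # natural parameters
Y = (rng.random((n, p)) < 1.0 / (1.0 + np.exp(-L))).astype(float)    # Bernoulli entries
mask = rng.random((n, p)) < 0.7                                      # observed subset

def svt(X, tau):
    # Singular-value soft-thresholding: the prox operator of the nuclear norm
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    return (U * np.maximum(s - tau, 0.0)) @ Vt

def objective(T, lam):
    # Masked logistic (Bernoulli negative log-likelihood) loss + nuclear norm
    return np.sum(mask * (np.logaddexp(0.0, T) - Y * T)) + \
        lam * np.linalg.svd(T, compute_uv=False).sum()

Theta, step, lam = np.zeros((n, p)), 1.0, 0.5
for _ in range(200):
    grad = mask * (1.0 / (1.0 + np.exp(-Theta)) - Y)   # gradient of logistic loss
    Theta = svt(Theta - step * grad, step * lam)
```

Swapping the loss (e.g., Poisson for counts) or the prox operator (e.g., soft-thresholding for sparsity) changes the exponential-family member or the structural constraint while keeping the same solver skeleton.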