
## Guaranteed minimum-rank solutions of linear matrix equations via nuclear norm minimization (2007)


### Download Links

- [www.yaroslavvb.com]
- [www.ist.caltech.edu]
- [www.cs.wisc.edu]
- [pages.cs.wisc.edu]
- [faculty.washington.edu]
- [www.mit.edu]
- [arxiv.org]
- [iweb.tntech.edu]
- [www.optimization-online.org]

Citations: 562 (20 self)

### Citations

7720 | Topics in Matrix Analysis
- Horn, Johnson
- 1991
Citation Context: ...(2.10) The similarity between (2.9) and (2.10) is particularly transparent if we recall the polar decomposition of a matrix into a product of orthogonal and positive semidefinite matrices (see, e.g., [33]). The “angular” component of the matrix X is exactly given by UV′. Thus, these subgradients always have the form of an “angle” (or sign), plus possibly a contraction in an orthogonal direction if t...
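As an aside, the polar decomposition mentioned in this context is readily computed from a thin SVD; the sketch below (NumPy, random data for illustration, not the paper's code) forms the “angular” factor UV′ and the positive semidefinite factor and checks that their product recovers X:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.standard_normal((5, 3))

# Thin SVD: X = U @ diag(s) @ Vt
U, s, Vt = np.linalg.svd(X, full_matrices=False)

# Polar decomposition X = Q H: Q = U V' is the "angular" part
# (orthonormal columns), H = V diag(s) V' is positive semidefinite.
Q = U @ Vt
H = Vt.T @ (s[:, None] * Vt)

assert np.allclose(X, Q @ H)
assert np.allclose(Q.T @ Q, np.eye(3))
```

Here Q is exactly the UV′ term appearing in the subgradient formula (2.9).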

7448 | Convex Optimization
- Boyd, Vandenberghe
- 2004
Citation Context: ...main convex optimization problem studied in this article. Our discussion of matrix norms and their connections to semidefinite programming and convex optimization will mostly follow the discussion in [9, 37, 79] where extensive lists of references are provided. Matrix vs. vector norms: The three vector norms that play significant roles in the compressed sensing framework are the ℓ1, ℓ2, and ℓ∞ norms, denoted ...

4206 | Regression shrinkage and selection via the lasso
- Tibshirani
- 1996
Citation Context: ...g to deconvolve seismic activity [21, 74]. Since then, ℓ1 minimization has been applied to a variety of cardinality minimization problems including image denoising [65], model selection in statistics [76], sparse approximation [20], portfolio optimization with fixed transaction costs [49], design of sparse interconnect wiring in circuits [80], and design of sparse feedback gains in control systems [43...

3609 | Compressed sensing
- Donoho
- 2006
Citation Context: ...the ℓ1 norm) of the diagonal elements. Minimization of the ℓ1 norm is a well-known heuristic for the cardinality minimization problem, and stunning results pioneered by Candès and Tao [10] and Donoho [17] have characterized a vast set of instances for which the ℓ1 heuristic can be a priori guaranteed to yield the optimal solution. These techniques provide the foundations of the recently developed comp...
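The ℓ1 heuristic this context refers to can be sketched as a linear program; the toy instance below (assumed sizes, SciPy's `linprog` rather than anything from the paper) recovers a sparse vector from Gaussian measurements:

```python
import numpy as np
from scipy.optimize import linprog

rng = np.random.default_rng(1)
n, m, k = 50, 20, 3                    # ambient dim, measurements, sparsity (illustrative)
x_true = np.zeros(n)
x_true[rng.choice(n, k, replace=False)] = rng.standard_normal(k)
A = rng.standard_normal((m, n))
b = A @ x_true

# min sum(t)  s.t.  A x = b,  -t <= x <= t   (variables z = [x, t])
c = np.concatenate([np.zeros(n), np.ones(n)])
A_eq = np.hstack([A, np.zeros((m, n))])
I = np.eye(n)
A_ub = np.vstack([np.hstack([I, -I]),      #  x - t <= 0
                  np.hstack([-I, -I])])    # -x - t <= 0
res = linprog(c, A_ub=A_ub, b_ub=np.zeros(2 * n), A_eq=A_eq, b_eq=b,
              bounds=[(None, None)] * n + [(0, None)] * n)
x_hat = res.x[:n]
```

For these sizes, ℓ1 minimization recovers the true sparse vector exactly, illustrating the "a priori guarantee" the excerpt describes.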

2717 | Atomic Decomposition by Basis Pursuit
- Chen, Donoho, et al.
- 1998
Citation Context: ...ivity [21, 74]. Since then, ℓ1 minimization has been applied to a variety of cardinality minimization problems including image denoising [65], model selection in statistics [76], sparse approximation [20], portfolio optimization with fixed transaction costs [49], design of sparse interconnect wiring in circuits [80], and design of sparse feedback gains in control systems [43]. Recently, stunning resul...

2621 | Robust uncertainty principles: exact signal reconstruction from highly incomplete frequency information
- Candès, Romberg, et al.
- 2006
Citation Context: ...the foundations of the recently developed compressed sensing or compressive sampling frameworks for measurement, coding, and signal estimation. As has been shown by a number of research groups (e.g., [4, 12, 13, 14]), the ℓ1 heuristic for cardinality minimization provably recovers the sparsest solution whenever the sensing matrix has certain “basis incoherence” properties, and in particular, when it is randomly ...

2461 | A global geometric framework for nonlinear dimensionality reduction
- Tenenbaum, Silva, et al.
- 2000
Citation Context: ...ld learning, one may be given high dimensional data with low-dimensional structure that can be recovered by searching for a low-dimensional embedding of the data preserving local distance information [64, 75]. A symmetric matrix D ∈ S^n is called a Euclidean distance matrix (EDM) if there exist points x_1, . . . , x_n in R^d such that D_ij = ‖x_i − x_j‖². Let V := I_n − (1/n)11′ be the projection matrix onto the h...

2414 | Nonlinear dimensionality reduction by locally linear embedding
- Roweis, Saul
Citation Context: ...ld learning, one may be given high dimensional data with low-dimensional structure that can be recovered by searching for a low-dimensional embedding of the data preserving local distance information [64, 75]. A symmetric matrix D ∈ S^n is called a Euclidean distance matrix (EDM) if there exist points x_1, . . . , x_n in R^d such that D_ij = ‖x_i − x_j‖². Let V := I_n − (1/n)11′ be the projection matrix onto the h...

2267 | Nonlinear total variation based noise removal algorithms
- Rudin, Osher, et al.
- 1992
Citation Context: ...the 1970s by geophysicists attempting to deconvolve seismic activity [21, 74]. Since then, ℓ1 minimization has been applied to a variety of cardinality minimization problems including image denoising [65], model selection in statistics [76], sparse approximation [20], portfolio optimization with fixed transaction costs [49], design of sparse interconnect wiring in circuits [80], and design of sparse f...

1666 | Matching pursuits with time-frequency dictionaries
- Mallat, Zhang
- 1993
Citation Context: ...and is known to be NP-hard [56]. Just as in the case of rank minimization, a variety of heuristic algorithms have been proposed to solve cardinality minimization problems including Projection Pursuit [40, 51, 62] and Orthogonal Matching Pursuit [19, 24, 61]. For diagonal matrices, the sum of the singular values is equal to the sum of the absolute values (i.e., the ℓ1 norm) of the diagonal elements. Minimizati...

1432 | Compressive sampling
- Candès
- 2006
Citation Context: ...e values (i.e., the ℓ1 norm) of the diagonal elements. Minimization of the ℓ1 norm is a well-known heuristic for the cardinality minimization problem, and stunning results pioneered by Candès and Tao [10] and Donoho [17] have characterized a vast set of instances for which the ℓ1 heuristic can be a priori guaranteed to yield the optimal solution. These techniques provide the foundations of the recentl...

1398 | Decoding by linear programming
- Candès, Tao
Citation Context: ...the foundations of the recently developed compressed sensing or compressive sampling frameworks for measurement, coding, and signal estimation. As has been shown by a number of research groups (e.g., [4, 12, 13, 14]), the ℓ1 heuristic for cardinality minimization provably recovers the sparsest solution whenever the sensing matrix has certain “basis incoherence” properties, and in particular, when it is randomly ...

1389 | Stable signal recovery from incomplete and inaccurate measurements
- Candès, Romberg, et al.
Citation Context: ...the foundations of the recently developed compressed sensing or compressive sampling frameworks for measurement, coding, and signal estimation. As has been shown by a number of research groups (e.g., [4, 12, 13, 14]), the ℓ1 heuristic for cardinality minimization provably recovers the sparsest solution whenever the sensing matrix has certain “basis incoherence” properties, and in particular, when it is randomly ...

1364 | Using SeDuMi 1.02, a Matlab toolbox for optimization over symmetric cones
- Sturm
- 1999
Citation Context: ...the linear mapping A, this may entail solving a potentially large, dense linear system. If the matrix dimensions n and m are not too large, then any good interior point SDP solver, such as SeDuMi [48] or SDPT3 [51], will quickly produce accurate solutions. In fact, as we will see in the next section, problems with n and m around 50 can be solved to machine precision in minutes on a desktop compute...

1325 | Least angle regression
- Efron, Hastie, et al.
- 2004
Citation Context: ...terest to investigate the possible adaptation of some of the successful path-following approaches in traditional ℓ1/cardinality minimization, such as the Homotopy [40] or LARS (least angle regression) [21]. This may not be completely straightforward, since the efficiency of many of these methods often relies explicitly on the polyhedral structure of the feasible set of the ℓ1 norm problem. Geometric...

1222 | Nonlinear Programming (2nd edition), Athena Scientific
- Bertsekas
- 1999
Citation Context: ...on problem is equal to the maximum of the second. This notion of duality generalizes the well-known case of linear programming, and is in fact applicable to all convex optimization problems; see, e.g., [8, 9]. The formulation (2.7) is valid for any norm minimization problem, by replacing the norms appearing above by any dual pair of norms. In particular, if we replace the nuclear norm with the ℓ1 norm and...

1101 | Semidefinite programming
- Vandenberghe, Boyd
- 1994
Citation Context: ...e minimization of the nuclear norm under affine equality constraints, the main convex optimization problem studied in this article. Our discussion of matrix norms will mostly follow the discussion in [27, 53] where extensive lists of references are provided. Matrix vs. Vector Norms: The three vector norms that play significant roles in the compressed sensing framework are the ℓ1, ℓ2, and ℓ∞ norms, denoted ...

804 | Constrained Optimization and Lagrange Multiplier Methods
- Bertsekas
- 1982
Citation Context: ...and Y = LR′. We summarize below the details of this approach. The algorithm employed is called the method of multipliers, a standard approach for solving equality constrained optimization problems [6]. The method of multipliers works with an augmented Lagrangian for (5.1), La(L, R; y, σ) := (1/2)(‖L‖_F² + ‖R‖_F²) − y′(A(LR′) − b) + (σ/2)‖A(LR′) − b‖², (5.2) where the y_i are arbitrarily s...
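The augmented Lagrangian (5.2) is straightforward to evaluate and differentiate; the sketch below (random A, b, y and a hypothetical factor size r, NumPy, not the authors' SDPLR code) checks the gradient a first-order method of multipliers would use against a central finite difference:

```python
import numpy as np

rng = np.random.default_rng(2)
n, m, r, p = 6, 5, 2, 12                  # illustrative sizes
L, R = rng.standard_normal((n, r)), rng.standard_normal((m, r))
Amat = rng.standard_normal((p, n * m))    # A(X) = Amat @ vec(X), row-major vec
b, y = rng.standard_normal(p), rng.standard_normal(p)
sigma = 1.5

def La(L, R):
    # augmented Lagrangian (5.2)
    resid = Amat @ (L @ R.T).ravel() - b
    return (0.5 * (np.sum(L**2) + np.sum(R**2))
            - y @ resid + 0.5 * sigma * resid @ resid)

def grads(L, R):
    resid = Amat @ (L @ R.T).ravel() - b
    # adjoint of the linear map applied to (sigma*resid - y), as an n x m matrix
    W = (Amat.T @ (sigma * resid - y)).reshape(n, m)
    return L + W @ R, R + W.T @ L

gL, _ = grads(L, R)
eps = 1e-6
Lp, Lm = L.copy(), L.copy()
Lp[0, 0] += eps
Lm[0, 0] -= eps
fd = (La(Lp, R) - La(Lm, R)) / (2 * eps)
assert abs(fd - gL[0, 0]) < 1e-5
```

The full method alternates inner minimizations of La over (L, R) with multiplier updates y ← y − σ(A(LR′) − b); only the gradient check is shown here.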

632 | Orthogonal matching pursuit: Recursive function approximation with applications to wavelet decomposition
- Pati, Rezaiifar, et al.
- 1993
Citation Context: ...he case of rank minimization, a variety of heuristic algorithms have been proposed to solve cardinality minimization problems including Projection Pursuit [40, 51, 62] and Orthogonal Matching Pursuit [19, 24, 61]. For diagonal matrices, the sum of the singular values is equal to the sum of the absolute values (i.e., the ℓ1 norm) of the diagonal elements. Minimization of the ℓ1 norm is a well-known heuristic f...

625 | A simple proof of the Restricted Isometry Property for random matrices
- Baraniuk, Davenport, et al.

556 | Sparse approximate solutions to linear systems
- Natarajan
- 1995
Citation Context: ...parsest vector in an affine subspace. This problem is commonly referred to as cardinality minimization, since we seek the vector whose support has the smallest cardinality, and is known to be NP-hard [39]. For diagonal matrices, the sum of the singular values is equal to the sum of the absolute values (i.e., the ℓ1 norm) of the diagonal elements. Minimization of the ℓ1 norm is a well-known heuristic f...

550 | Projection pursuit regression
- Friedman, Stützle
- 1981
Citation Context: ...and is known to be NP-hard [56]. Just as in the case of rank minimization, a variety of heuristic algorithms have been proposed to solve cardinality minimization problems including Projection Pursuit [40, 51, 62] and Orthogonal Matching Pursuit [19, 24, 61]. For diagonal matrices, the sum of the singular values is equal to the sum of the absolute values (i.e., the ℓ1 norm) of the diagonal elements. Minimizati...

523 | The Geometry of Graphs and Some of its Algorithmic Applications
- Linial, London, et al.
- 1995
Citation Context: ...model for a random process (e.g., factor analysis), a low-order realization of a linear system [28], a low-order controller for a plant [22], or a low-dimensional embedding of data in Euclidean space [34]. ...

455 | The approximation of one matrix by another of lower rank
- Eckart, Young
- 1936
Citation Context: ...evel of the (i, j) pixel. The best rank-k approximation of M is given by X* := arg min_{rank(X)≤k} ‖M − X‖, where ‖·‖ is any unitarily invariant norm. By the classical Eckart-Young-Mirsky theorem ([20, 38]), the optimal approximant is given by a truncated singular value decomposition of M, i.e., if M = UΣV′, then X* = UΣ_kV′, where the first k diagonal entries of Σ_k are the largest k singular value...
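The truncated SVD in this context is a one-liner in NumPy; a small sketch on synthetic data (the Eckart-Young-Mirsky theorem guarantees the spectral-norm error equals the next singular value):

```python
import numpy as np

rng = np.random.default_rng(3)
M = rng.standard_normal((8, 6))
k = 2

U, s, Vt = np.linalg.svd(M, full_matrices=False)
# Keep the k largest singular values, zero out the rest
X_star = U[:, :k] @ np.diag(s[:k]) @ Vt[:k, :]

assert np.linalg.matrix_rank(X_star) == k
# Eckart-Young-Mirsky: spectral-norm error equals the (k+1)-st singular value
assert np.isclose(np.linalg.norm(M - X_star, 2), s[k])
```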

361 | SDPT3 - a MATLAB software package for semidefinite programming
- Toh, Todd, et al.
- 1999
Citation Context: ...pping A, this may entail solving a potentially large, dense linear system. If the matrix dimensions n and m are not too large, then any good interior point SDP solver, such as SeDuMi [48] or SDPT3 [51], will quickly produce accurate solutions. In fact, as we will see in the next section, problems with n and m around 50 can be solved to machine precision in minutes on a desktop computer...

338 | Problem complexity and method efficiency in optimization. Wiley-Interscience Series in Discrete Mathematics
- Nemirovskii, Yudin
- 1983
Citation Context: ...such that lim_{k→∞} s_k = 0 and Σ_{k>0} s_k diverging). More recently, several nonlinear projected subgradient methods, under the rubric of mirror descent, have been developed (e.g., by Nemirovski and Yudin [57]), followed by a subsequent rederivation and analysis by Beck and Teboulle [5]. These algorithms, and their accompanying analysis, provide improved theoretical guarantees and practical performance ove...

287 | Matrix Rank Minimization with Applications
- Fazel
- 2002
Citation Context: ...the general case, a variety of heuristic algorithms based on local optimization, including alternating projections [31] and alternating LMIs [45], have been proposed. A recent heuristic introduced in [27] minimizes the nuclear norm, or the sum of the singular values of the matrix, over the affine subset. The nuclear norm is a convex function, can be optimized efficiently, and is the best convex approx...

274 | A rank minimization heuristic with application to minimum order system approximation
- Fazel, Hindi, et al.
- 2001
Citation Context: ...nk of an appropriate matrix. For example, a low-rank matrix could correspond to a low-degree statistical model for a random process (e.g., factor analysis), a low-order realization of a linear system [28], a low-order controller for a plant [22], or a low-dimensional embedding of data in Euclidean space [34]. ...

268 | Orthogonal least squares methods and their application to non-linear system identification
- Chen, Billings, et al.
- 1989
Citation Context: ...he case of rank minimization, a variety of heuristic algorithms have been proposed to solve cardinality minimization problems including Projection Pursuit [40, 51, 62] and Orthogonal Matching Pursuit [19, 24, 61]. For diagonal matrices, the sum of the singular values is equal to the sum of the absolute values (i.e., the ℓ1 norm) of the diagonal elements. Minimization of the ℓ1 norm is a well-known heuristic f...

257 | Database-friendly random projections: Johnson-Lindenstrauss with binary coins
- Achlioptas
Citation Context: ...ll. The exponential bound in (4.2) guarantees union bounds will be small even for rather large sets. This concentration is the typical ingredient required to prove the Johnson-Lindenstrauss Lemma (cf. [2, 15]). The majority of nearly isometric random maps are described in terms of random matrices. For a linear map A : R^{m×n} → R^p, we can always write its matrix representation as A(X) = A vec(X), (4.4) w...
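The matrix representation (4.4) is worth making concrete; a minimal sketch (assumed row-major vec convention, random data for illustration):

```python
import numpy as np

rng = np.random.default_rng(5)
m, n, p = 3, 4, 5
A = rng.standard_normal((p, m * n))   # matrix representation of A : R^{m x n} -> R^p

def Aop(X):
    # A(X) = A vec(X); .ravel() stacks rows (row-major vec)
    return A @ X.ravel()

X, Y = rng.standard_normal((m, n)), rng.standard_normal((m, n))
assert Aop(X).shape == (p,)
assert np.allclose(Aop(2 * X + Y), 2 * Aop(X) + Aop(Y))   # linearity
```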

246 | Fast maximum margin matrix factorization for collaborative prediction
- Rennie, Srebro
- 2005
Citation Context: ...x is low-rank is when the columns are i.i.d. samples of a random process with low-rank covariance. Such models are ubiquitous in Factor Analysis, Collaborative Filtering, and Latent Semantic Indexing [42, 47]. In many of these settings, some prior probability distribution (such as a Bernoulli model or uniform distribution on subsets) is assumed to generate the set of available entries. Suppose we are pres...

244 | A new approach to variable selection in least squares problems
- Osborne, Presnell, et al.
- 2000
Citation Context: ...emory-efficient. It is also of much interest to investigate the possible adaptation of some of the successful path-following approaches in traditional ℓ1/cardinality minimization, such as the Homotopy [40] or LARS (least angle regression) [21]. This may not be completely straightforward, since the efficiency of many of these methods often relies explicitly on the polyhedral structure of the feasible...

244 | Enhancing sparsity by reweighted ℓ1 minimization
- Candès, Wakin, et al.
- 2008
Citation Context: ...nuclear norm heuristic in practice. This algorithm has been adapted to cardinality minimization, resulting in iterative weighted ℓ1 norm minimization, and has been successful in this setting as well [50, 17]. Another heuristic involves ℓp minimization (locally) with p < 1 [18]. These algorithms do not currently have any theoretical guarantees, and it b...

236 | Sparsity and incoherence in compressive sampling
- Candès, Romberg
- 2007
Citation Context: ...approximation is desired. Incoherent Ensembles and Partially Observed Transforms: Again, taking our lead from the compressed sensing literature, it would be of great interest to extend the results of [11] to low-rank recovery. In this work, the authors show that partially observed unitary transformations of sparse vectors can be used to recover the sparse vector using ℓ1 minimization. There are many p...

208 | Local operator theory, random matrices and Banach spaces
- Davidson, Szarek
- 2001
Citation Context: ...or D sufficiently large [56]. El Karoui uses this result to prove the concentration inequality (4.3) for all such distributions [23]. The result for Gaussians is rather tight with γ = 1/2 (see, e.g., [16]). Finally, note that a random projection also obeys all of the necessary concentration inequalities. Indeed, since the norm of a random projection is exactly √(D/p), (4.3) holds trivially. The concent...

195 | Sparse nonnegative solutions of underdetermined linear equations by linear programming
- Donoho, Tanner
Citation Context: ...imensional settings. Intriguingly, Figure 4 also demonstrates a “phase transition” between perfect recovery and failure. As observed in several recent papers by Donoho and his collaborators (see, e.g., [18, 19]), the random sparsity recovery problem has two distinct connected regions of parameter space: one where the sparsity pattern is perfectly recovered, and one where no sparse solution is found. Not sur...

187 | Exact reconstruction of sparse signals via nonconvex minimization
- Chartrand
Citation Context: ...cardinality minimization, resulting in iterative weighted ℓ1 norm minimization, and has been successful in this setting as well [50, 17]. Another heuristic involves ℓp minimization (locally) with p < 1 [18]. These algorithms do not currently have any theoretical guarantees, and it bears investigation if the probabilistic analysis developed in this p...

183 | Mathematical Control Theory
- Sontag
- 1998
Citation Context: ...ere h = [h(0), . . . , h(N)]′, and A_ij = a_i(N − j). From linear system theory, the order of the minimal realization for such a system is given by the rank of the following Hankel matrix (see, e.g., [29, 46]):

    hank(h) := ⎡ h(0)    h(1)    ···  h(N)    ⎤
               ⎢ h(1)    h(2)    ···  h(N+1)  ⎥
               ⎢  ⋮       ⋮            ⋮      ⎥
               ⎣ h(N)    h(N+1)  ···  h(2N)   ⎦

Therefore the problem can be expressed as: minimize rank(hank(h)) subject to Ah = y, where th...
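This Hankel-rank fact is easy to check numerically; the sketch below uses a made-up impulse response with two exponential modes (so the minimal realization has order 2) and `scipy.linalg.hankel`:

```python
import numpy as np
from scipy.linalg import hankel

N = 6
t = np.arange(2 * N + 1)
# impulse response of a hypothetical order-2 system: two decaying modes
h = 0.9**t + (-0.5)**t

# hank(h)[i, j] = h(i + j), i, j = 0..N
H = hankel(h[:N + 1], h[N:])
assert np.linalg.matrix_rank(H, tol=1e-8) == 2
```

The rank of the Hankel matrix recovers the system order regardless of N (for N at least the order), which is what makes rank minimization over hank(h) the natural formulation.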

160 | Mirror descent and nonlinear projected subgradient methods for convex optimization
- Beck, Teboulle
- 2003
Citation Context: ...ar projected subgradient methods, under the rubric of mirror descent, have been developed (e.g., by Nemirovski and Yudin [57]), followed by a subsequent rederivation and analysis by Beck and Teboulle [5]. These algorithms, and their accompanying analysis, provide improved theoretical guarantees and practical performance over the standard Euclidean projected subgradient method described above. It woul...

159 | A nonlinear programming algorithm for solving semidefinite programs via low-rank factorization
- Burer, Monteiro
Citation Context: ...imal solution of (2.8) from the solution computed by the method of multipliers. SDPLR and the method of multipliers: For general semidefinite programming problems, Burer and Monteiro have developed in [8, 9] a nonlinear programming approach that relies on a low-rank factorization of the matrix decision variable. We will adapt this idea to our problem, to provide a first-order Lagrangian minimization algo...

137 | Robust modeling with erratic data
- Claerbout, Muir
- 1973
Citation Context: ...nal elements. Minimization of the ℓ1 norm is a well-known heuristic for the cardinality minimization problem, employed as early as the 1970s by geophysicists attempting to deconvolve seismic activity [21, 74]. Since then, ℓ1 minimization has been applied to a variety of cardinality minimization problems including image denoising [65], model selection in statistics [76], sparse approximation [20], portfoli...

136 | A Unified Algebraic Approach to Linear Control Design. Taylor and Francis
- Skelton, Iwasaki, et al.
- 1998
Citation Context: ...exponential running times in both theory and practice. For the general case, a variety of heuristic algorithms based on local optimization, including alternating projections [31] and alternating LMIs [45], have been proposed. A recent heuristic introduced in [27] minimizes the nuclear norm, or the sum of the singular values of the matrix, over the affine subset. The nuclear norm is a convex function, ...

123 | Log-det heuristic for matrix rank minimization with applications to Hankel and Euclidean distance matrices
- Fazel, Hindi, et al.
- 2003
Citation Context: ...ere h = [h(0), . . . , h(N)]′, and A_ij = a_i(N − j). From linear system theory, the order of the minimal realization for such a system is given by the rank of the following Hankel matrix (see, e.g., [29, 46]):

    hank(h) := ⎡ h(0)    h(1)    ···  h(N)    ⎤
               ⎢ h(1)    h(2)    ···  h(N+1)  ⎥
               ⎢  ⋮       ⋮            ⋮      ⎥
               ⎣ h(N)    h(N+1)  ···  h(2N)   ⎦

Therefore the problem can be expressed as: minimize rank(hank(h)) subject to Ah = y, where th...

121 | An elementary proof of a theorem of Johnson and Lindenstrauss. Random Struct
- Dasgupta, Gupta
Citation Context: ...ll. The exponential bound in (4.2) guarantees union bounds will be small even for rather large sets. This concentration is the typical ingredient required to prove the Johnson-Lindenstrauss Lemma (cf. [2, 15]). The majority of nearly isometric random maps are described in terms of random matrices. For a linear map A : R^{m×n} → R^p, we can always write its matrix representation as A(X) = A vec(X), (4.4) w...

115 | Eigenvalue optimization
- Lewis, Overton
- 1996
Citation Context: ...rix with rank r and let X = UΣV′ be a singular value decomposition where U ∈ R^{m×r}, V ∈ R^{n×r} and Σ is an r × r diagonal matrix. The subdifferential of the nuclear norm at X is then given by (e.g., [46, 82]) ∂‖X‖∗ = {UV′ + W : W and X have orthogonal row and column spaces, and ‖W‖ ≤ 1}. (2.9) For comparison, recall the case of the ℓ1 norm, where T denotes the support of the n-vector x, T^c is the comp...

106 | A cone complementarity linearization algorithm for static output-feedback and related problems
- Ghaoui, Oustry, et al.
- 1997
Citation Context: ...For the general case, a variety of heuristic algorithms based on local optimization, including alternating projections and its variations [42, 58], alternating matrix inequalities [68], linearization [31], and augmented Lagrangian methods [34] have been proposed. A recent heuristic introduced by Fazel et al. in [37, 38] minimizes the nuclear norm, or the sum of the singular values of the matrix, over ...

100 | Remarks to Maurice Frechet’s article “Sur la definition axiomatique d’une classe d’espace distancies vectoriellement applicable sur l’espace de Hilbert”
- Schoenberg
- 1935
Citation Context: ...sical result by Schoenberg states that D is a Euclidean distance matrix of n points in R^d if and only if D_ii = 0, the matrix V DV is negative semidefinite, and rank(V DV) is less than or equal to d [44]. If the matrix D is known exactly, the corresponding configuration of points (up to a unitary transform) is obtained by simply taking a matrix square root of −(1/2)V DV. However, in many cases, only ...
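Schoenberg's characterization quoted here can be verified numerically; the sketch below (hypothetical random planar points, NumPy) checks that −(1/2)V DV is positive semidefinite with rank equal to the embedding dimension, and recovers a configuration from its eigendecomposition:

```python
import numpy as np

rng = np.random.default_rng(4)
n, d = 7, 2
pts = rng.standard_normal((n, d))
# squared-distance matrix D_ij = ||x_i - x_j||^2
G = pts @ pts.T
D = np.diag(G)[:, None] + np.diag(G)[None, :] - 2 * G

V = np.eye(n) - np.ones((n, n)) / n       # projection onto the hyperplane orthogonal to 1
B = -0.5 * V @ D @ V                       # Gram matrix of the centered points

w, U = np.linalg.eigh(B)
assert np.all(w > -1e-9)                   # V D V negative semidefinite  <=>  B PSD
assert np.sum(w > 1e-9) == d               # rank(V D V) equals the embedding dimension
# recover a configuration (up to an orthogonal transform) from the top-d eigenpairs
X = U[:, -d:] * np.sqrt(w[-d:])
D_rec = np.sum((X[:, None, :] - X[None, :, :])**2, axis=-1)
assert np.allclose(D_rec, D)
```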

100 | A note on the largest eigenvalue of a large dimensional sample covariance matrix
- Bai, Silverstein, et al.
- 1988
Citation Context: ..., and Krishnaiah, who showed that whenever the entries A_ij are i.i.d. with zero mean and finite fourth moment, then the maximum singular value of A is almost surely 1 + √(D/p) for D sufficiently large [56]. El Karoui uses this result to prove the concentration inequality (4.3) for all such distributions [23]. The result for Gaussians is rather tight with γ = 1/2 (see, e.g., [16]). Finally, note that a ...

99 | Interior-point method for nuclear norm approximation with application to system identification
- Liu, Vandenberghe
Citation Context: ...of problems that can be solved. This can be controlled to some extent by exploiting the problem structure when assembling and solving the Newton system, as in the recent work of Liu and Vandenberghe [48]. Perhaps the most important drawback of the direct SDP approach is that it completely ignores the possibility of efficiently computing the nuclear norm via a singular value decomposition, instead of ...

95 | Adaptive time-frequency decompositions
- Davis, Mallat, et al.
- 1994
Citation Context: ...he case of rank minimization, a variety of heuristic algorithms have been proposed to solve cardinality minimization problems including Projection Pursuit [40, 51, 62] and Orthogonal Matching Pursuit [19, 24, 61]. For diagonal matrices, the sum of the singular values is equal to the sum of the absolute values (i.e., the ℓ1 norm) of the diagonal elements. Minimization of the ℓ1 norm is a well-known heuristic f...

90 | Characterization of the subdifferential of some matrix norms, Linear Algebra and its Applications
- Watson
- 1992
Citation Context: ...ith rank r and let X = UΣV′ be a singular value decomposition where U ∈ R^{m×r}, V ∈ R^{n×r} and Σ is an r × r diagonal matrix. The subdifferential of the nuclear norm at X is then given by (see, e.g., [55]) ∂‖X‖∗ = {UV′ + W : W and X have orthogonal row and column spaces, and ‖W‖ ≤ 1}. (2.9) For comparison, recall the case of the ℓ1 norm, where T denotes the support of the n-vector x, ...

89 | Handbook of Semidefinite Programming
- Wolkowicz, Saigal, et al.
- 2000
Citation Context: ...rior point condition since both (2.5) and (2.6) admit strictly feasible solutions. Interested readers can find an in-depth discussion of Slater conditions for semidefinite programming in Chapter 4 of [83]. Convex envelopes of rank and cardinality functions: Let C be a given convex set. The convex envelope of a (possibly nonconvex) function f : C → R is defined as the largest convex function ...

87 | Symmetric gauge functions and unitarily invariant norms
- Mirsky
- 1960
Citation Context: ...evel of the (i, j) pixel. The best rank-k approximation of M is given by X* := arg min_{rank(X)≤k} ‖M − X‖, where ‖·‖ is any unitarily invariant norm. By the classical Eckart-Young-Mirsky theorem ([20, 38]), the optimal approximant is given by a truncated singular value decomposition of M, i.e., if M = UΣV′, then X* = UΣ_kV′, where the first k diagonal entries of Σ_k are the largest k singular value...

87 | An architecture for compressive imaging
- Wakin, Laska, et al.
- 2006
Citation Context: ...he mn required to transmit the values of all the entries. Consider a given image, whose associated matrix M has low rank, or can be well-approximated by a low-rank matrix. As proposed by Wakin et al. [54], a single-pixel camera would ideally produce measurements that are random linear combinations of all the pixels of the given image. Under this situation, the image reconstruction problem boils down e...

75 | Constructive Approximation: Advanced Problems
- Lorentz, Golitschek, et al.
- 1996
Citation Context: ...ubgroup of the orthogonal group, and is independent of the dimension of the homogeneous space. Hence, one might expect this constant to be quite large. However, it is known that for the sphere C_0 ≤ 3 [35], and there is no indication that this constant is not similarly small for the Grassmannian. We now proceed to the proof of the main result in this section. For this, we use a union bound to combine t...

72 | Analysis on symmetric cones. Oxford Mathematical Monographs
- Faraut, Koranyi
- 1994
Citation Context: ...ality minimization. A convenient mathematical framework that allows the simultaneous consideration of these cases as well as a few new ones, is that of Jordan algebras and the related symmetric cones [24]. In the Jordan-algebraic setting, there is an intrinsic notion of rank that agrees with the cardinality of the support in the case of the nonnegative orthant or the rank of a matrix in the case of th...

71 | Learning with Matrix Factorizations
- Srebro
- 2004
Citation Context: ...x is low-rank is when the columns are i.i.d. samples of a random process with low-rank covariance. Such models are ubiquitous in Factor Analysis, Collaborative Filtering, and Latent Semantic Indexing [42, 47]. In many of these settings, some prior probability distribution (such as a Bernoulli model or uniform distribution on subsets) is assumed to generate the set of available entries. Suppose we are pres...

71 | Portfolio optimization with linear and fixed transaction costs
- Lobo, Fazel, et al.
- 2007
Citation Context: ...ed to a variety of cardinality minimization problems including image denoising [65], model selection in statistics [76], sparse approximation [20], portfolio optimization with fixed transaction costs [49], design of sparse interconnect wiring in circuits [80], and design of sparse feedback gains in control systems [43]. Recently, stunning results pioneered by Candès and Tao [12] and Donoho [25] have c...

66 | Neighborliness of randomly projected simplices in high dimensions
- Donoho, Tanner
Citation Context: ...imensional settings. Intriguingly, Figure 4 also demonstrates a “phase transition” between perfect recovery and failure. As observed in several recent papers by Donoho and his collaborators (see, e.g., [18, 19]), the random sparsity recovery problem has two distinct connected regions of parameter space: one where the sparsity pattern is perfectly recovered, and one where no sparse solution is found. Not sur...

62 |
Convex Analysis and Minimization Algorithms II: Advanced Theory and Bundle Methods.
- Hiriart-Urruty, Lemarechal
- 1993
(Show Context)
Citation Context ...ity functions Let C be a given convex set. The convex envelope of a (possibly nonconvex) function f : C → R is defined as the largest convex function g such that g(x) ≤ f(x) for all x ∈ C (see, e.g., =-=[32]-=-). This means that among all convex functions, g is the best pointwise approximation to f. In particular, if the optimal g can be conveniently described, it can serve as an approximation to f that can... |
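The best-known instance of this definition in the surrounding paper is that the nuclear norm is the convex envelope of the rank function on the spectral-norm unit ball. The envelope inequality g(x) ≤ f(x) can be sanity-checked numerically; this is a sketch of the inequality, not a proof of the envelope property.

```python
import numpy as np

rng = np.random.default_rng(1)
for _ in range(100):
    X = rng.standard_normal((6, 4))
    X /= np.linalg.norm(X, 2)        # rescale so the spectral norm is 1
    nuc = np.linalg.norm(X, 'nuc')   # sum of singular values
    rank = np.linalg.matrix_rank(X)
    # Envelope property: the convex under-approximation never exceeds f.
    assert nuc <= rank + 1e-9
print("envelope inequality holds on all samples")
```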

61 |
On the rank minimization problem over a positive semidefinite linear matrix inequality.
- Mesbahi, Papavassilopoulos
- 1997
(Show Context)
Citation Context ...n. In certain instances with very special structure, the rank minimization problem can be solved by using the singular value decomposition, or can be exactly reduced to the solution of linear systems =-=[37, 41]-=-. In general, however, problem (1.1) is a challenging nonconvex optimization problem for which all known finite time algorithms have at least doubly exponential running times in both theory and practi... |

56 |
Sparsity and Incoherence
- Candes, Romberg
- 2007
(Show Context)
Citation Context ...g literature. We also describe possible extensions to more general notions of parsimony. Incoherent Ensembles and Partially Observed Transforms. It would be of great interest to extend the results of =-=[13]-=- to low-rank recovery. In this work, the authors show that partially observed unitary transformations of sparse vectors can be used to recover th... |

48 | Local minima and convergence in low-rank semidefinite programming
- Burer, Monteiro
(Show Context)
Citation Context ...imal solution of (2.8) from the solution computed by the method of multipliers. SDPLR and the method of multipliers For general semidefinite programming problems, Burer and Monteiro have developed in =-=[8, 9]-=- a nonlinear programming approach that relies on a low-rank factorization of the matrix decision variable. We will adapt this idea to our problem, to provide a first-order Lagrangian minimization algo... |

46 |
Signal representation using adaptive normalized Gaussian functions,”
- Qian, Chen
- 1994
(Show Context)
Citation Context ...and is known to be NP-hard [56]. Just as in the case of rank minimization, a variety of heuristic algorithms have been proposed to solve cardinality minimization problems including Projection Pursuit =-=[40, 51, 62]-=- and Orthogonal Matching Pursuit [19, 24, 61]. For diagonal matrices, the sum of the singular values is equal to the sum of the absolute values (i.e., the ℓ1 norm) of the diagonal elements. Minimizati... |
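The diagonal-case identity stated here, that the sum of the singular values of a diagonal matrix equals the ℓ1 norm of its diagonal, is easy to verify directly (a minimal sketch):

```python
import numpy as np

d = np.array([3.0, -1.5, 0.0, 2.0])
X = np.diag(d)

nuclear = np.linalg.svd(X, compute_uv=False).sum()  # sum of singular values
ell1 = np.abs(d).sum()                              # l1 norm of the diagonal

print(nuclear, ell1)  # both equal 6.5
assert np.isclose(nuclear, ell1)
```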

43 |
Euclidean Jordan algebras and interior-point algorithms.
- Faybusovich
- 1997
(Show Context)
Citation Context ...uld transparently yield similar results for the case of second-order (or Lorentz) cone constraints. As specific examples of the power and elegance of this approach, we mention the work of Faybusovich =-=[25]-=- and Schmieta and Alizadeh [43] that provide a unified development of interior point methods for symmetric cones, as well as Faybusovich’s work on convexity theorems for quadratic mappings [26]. Parsi... |

43 | An Elementary Proof of a Theorem of
- Dasgupta, Gupta
- 2003
(Show Context)
Citation Context ...e exponential bound in (4.2) guarantees that union bounds will be small even for rather large sets. This concentration is the typical ingredient required to prove the Johnson–Lindenstrauss lemma (cf. =-=[2, 22]-=-). It is often simpler to describe nearly isometric random maps in terms of random matrices. Let D := mn. For a linear map A : R m×n → R p , we can always write its matrix representation as (4.4) A(X)... |

42 | Rank minimization under LMI constraints: A framework for output feedback problems.
- Ghaoui, Gahinet
- 1993
(Show Context)
Citation Context ... a low-rank matrix could correspond to a low-degree statistical model for a random process (e.g., factor analysis), a low-order realization of a linear system [28], a low-order controller for a plant =-=[22]-=-, or a low-dimensional embedding of data in Euclidean space [34]. |

38 | A Newton-like method for solving rank constrained linear matrix inequalities.
- Orsi, Helmke, et al.
- 2006
(Show Context)
Citation Context ...t least exponential running time in both theory and practice. For the general case, a variety of heuristic algorithms based on local optimization, including alternating projections and its variations =-=[42, 58]-=-, alternating matrix inequalities [68], linearization [31], and augmented Lagrangian methods [34] have been proposed. A recent heuristic introduced by Fazel et al. in [37, 38] minimizes the nuclear no... |

36 |
Singular value decomposition (SVD) image coding.
- Andrews, Patterson
- 1976
(Show Context)
Citation Context ...rite (1.1) as an affine rank minimization problem. Image Compression A simple and well-known method to compress two-dimensional images can be obtained by using the singular value decomposition (e.g., =-=[3]-=-). The basic idea is to associate to the given grayscale image a rectangular matrix M, with the entries Mij corresponding to the gray level of the (i, j) pixel. The best rank-k approximation of M is g... |
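The compression scheme described here, truncating the SVD of the pixel matrix M to its top k singular values, can be sketched as follows. The "image" is a synthetic stand-in matrix; by the Eckart–Young theorem the truncation is the best rank-k approximation in both the spectral and Frobenius norms.

```python
import numpy as np

rng = np.random.default_rng(0)
M = rng.random((64, 48))  # stand-in for a grayscale image; entries = gray levels

def rank_k_approx(M, k):
    """Best rank-k approximation of M (Eckart-Young), via truncated SVD."""
    U, s, Vt = np.linalg.svd(M, full_matrices=False)
    return (U[:, :k] * s[:k]) @ Vt[:k, :]

M5 = rank_k_approx(M, 5)
print(np.linalg.matrix_rank(M5))
# Storage drops from 64*48 numbers to 5*(64 + 48 + 1).
```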

34 | Enhacing sparsity by reweighted ℓ1 minimization
- Candès, Wakin, et al.
- 2008
(Show Context)
Citation Context ... nuclear norm heuristic in practice. This algorithm has been adapted to cardinality minimization, resulting in iterative weighted ℓ1 norm minimization, and has been successful in this setting as well =-=[49, 17]-=-. Another heuristic involves ℓp norm minimization (locally) with p < 1 [18]. These algorithms do not currently have any theoretical guarantees, and it bears investigation if the probabilistic analysis... |

33 | Structured low-rank approximation and its applications.
- Markovsky
- 2008
(Show Context)
Citation Context ...roblem, minimize rank(X) subject to X ∈ C, (1.1) where X ∈ Rm×n is the decision variable, and C is some given convex constraint set. This problem arises in various application areas; see, for example, =-=[37, 52]-=-. In certain instances with very special structure, the rank minimization problem can be solved by using the singular value decomposition, or can be exactly reduced to the solution of linear systems [... |

31 | Low-authority controller design via convex optimization,” in
- Hassibi, How, et al.
- 1998
(Show Context)
Citation Context ...76], sparse approximation [20], portfolio optimization with fixed transaction costs [49], design of sparse interconnect wiring in circuits [80], and design of sparse feedback gains in control systems =-=[43]-=-. Recently, stunning results pioneered by Candès and Tao [12] and Donoho [25] have characterized a vast set of instances for which the ℓ1 heuristic can be a priori guaranteed to yield the optimal so... |

30 | Distance matrix completion by numerical optimization.
- Trosset
- 2000
(Show Context)
Citation Context ... MDS) comparisons of objects. In computational chemistry, they come up in inferring the three-dimensional structure of a molecule (molecular conformation) from information about interatomic distances =-=[52]-=-. A symmetric matrix D ∈ S n is called a Euclidean distance matrix (EDM) if there exist points x1, . . . , xn in R d such that Dij = �xi − xj� 2 . Let V := In − 1 n 11T be the projection matrix onto t... |
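The EDM construction in this context can be sketched numerically: build D from a point set, then recover the centered Gram matrix as −½ V D V with V = I − (1/n)11ᵀ; its rank equals the embedding dimension d, which is the basis of classical multidimensional scaling. Dimensions below are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d = 8, 3
P = rng.standard_normal((n, d))                 # points x_1..x_n in R^d

# Euclidean distance matrix: D_ij = ||x_i - x_j||^2
D = ((P[:, None, :] - P[None, :, :]) ** 2).sum(-1)

V = np.eye(n) - np.ones((n, n)) / n             # projection onto 1-perp
G = -0.5 * V @ D @ V                            # centered Gram matrix

print(np.linalg.matrix_rank(G, tol=1e-8))       # rank reveals the embedding dim
```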

30 | An augmented Lagrangian method for a class of lmiconstrained problems in robust control theory
- Fares, Apkarian, et al.
(Show Context)
Citation Context ...istic algorithms based on local optimization, including alternating projections and its variations [42, 58], alternating matrix inequalities [68], linearization [31], and augmented Lagrangian methods =-=[34]-=- have been proposed. A recent heuristic introduced by Fazel et al. in [37, 38] minimizes the nuclear norm, or the sum of the singular values of the matrix, over the constraint set. The nuclear norm is... |

26 |
Alternating projection algorithms for linear matrix inequalities problems with rank constraints.
- Grigoriadis, Beran
- 2000
(Show Context)
Citation Context ...thms have at least doubly exponential running times in both theory and practice. For the general case, a variety of heuristic algorithms based on local optimization, including alternating projections =-=[31]-=- and alternating LMIs [45], have been proposed. A recent heuristic introduced in [27] minimizes the nuclear norm, or the sum of the singular values of the matrix, over the affine subset. The nuclear n... |

26 | Optimal wire and transistor sizing for circuits with non-tree topology.
- Vandenberghe, Boyd, et al.
- 1997
(Show Context)
Citation Context ...cluding image denoising [65], model selection in statistics [76], sparse approximation [20], portfolio optimization with fixed transaction costs [50], design of sparse interconnect wiring in circuits =-=[80]-=-, and design of sparse feedback gains in control systems [43]. Recently, results pioneered by Candès and Tao [16] and Donoho [25] have characteri... |

25 |
Deconvolution with the ℓ1 norm.
- Taylor, Banks, et al.
- 1979
(Show Context)
Citation Context ...nal elements. Minimization of the ℓ1 norm is a well-known heuristic for the cardinality minimization problem, employed as early as the 1970s by geophysicists attempting to deconvolve seismic activity =-=[21, 74]-=-. Since then, ℓ1 minimization has been applied to a variety of cardinality minimization problems including image denoising [65], model selection in statistics [76], sparse approximation [20], portfoli... |

24 | On the largest principal angle between random subspaces.
- Absil, Edelman, et al.
- 2006
(Show Context)
Citation Context ...ociated with each subspace. This distance measures the operator norm of the difference between the corresponding projections, and is equal to the sine of the largest principal angle between T1 and T2 =-=[1]-=-. The set of all d-dimensional subspaces of R D is commonly known as the Grassmannian manifold G(D, d). We will endow it with the metric ρ(·, ·) given by (4.10), also known as the projection 2-norm.... |

24 |
Ait Rami, “A cone complementarity linearization algorithm for static output-feedback and related problems
- Ghaoui, Oustry, et al.
- 1997
(Show Context)
Citation Context ...For the general case, a variety of heuristic algorithms based on local optimization, including alternating projections and its variations [42, 58], alternating matrix inequalities [68], linearization =-=[31]-=-, and augmented Lagrangian methods [34], have been proposed. A recent heuristic introduced by Fazel et al. in [37, 38] minimizes the nuclear norm, or the sum of the singular values of the matrix, over... |

23 |
Associative and Jordan algebras, and polynomial time interior-point algorithms for symmetric cones.
- Schmieta, Alizadeh
- 2001
(Show Context)
Citation Context ... results for the case of second-order (or Lorentz) cone constraints. As specific examples of the power and elegance of this approach, we mention the work of Faybusovich [25] and Schmieta and Alizadeh =-=[43]-=- that provide a unified development of interior point methods for symmetric cones, as well as Faybusovich’s work on convexity theorems for quadratic mappings [26]. Parsimonious models and optimization... |

21 |
The finite dimensional basis problem with an appendix on nets of the Grassmann manifold.
- Szarek
- 1983
(Show Context)
Citation Context ... subspaces of Rn to resolution ɛ/2. Then for any (V, W ), there exist i and j such that ρ(V, Vi) ≤ ɛ/2 and ρ(W, Wj) ≤ ɛ/2. Therefore, N(ɛ) ≤ N1N2. By the work of Szarek on ɛ-nets of the Grassmannian (=-=[49]-=-, [50, Th. 8]) there is a universal constant C0, independent of m, n, and r, such that N1 ≤ (2C0/ɛ)^(r(m−r)) and N2 ≤ (2C0/ɛ)^(r(n−r)), (4.18) which completes the proof. The exact value of the universal ... |

21 | Metric Entropy of homogeneous spaces - Szarek - 1998

19 |
Computational study and comparisons of LFT reducibility methods.
- Beck, D’Andrea
- 1998
(Show Context)
Citation Context ... matrices with norm less than one. When the matrix variable is symmetric and positive semidefinite, this heuristic is equivalent to the trace heuristic often used by the control community (see, e.g., =-=[5, 37]-=-). The nuclear norm heuristic has been observed to produce very low-rank solutions in practice, but a theoretical characterization of when it produces the minimum rank solution has not been previously... |

17 |
Algorithms for the polar decomposition.
- Gander
- 1990
(Show Context)
Citation Context ... to the use of the SVD for the subgradient computation is to directly focus on the “angular” factor of the polar decomposition of Xk, using for instance the Newton-like methods developed by Gander in =-=[30]-=-. Specifically, for a given matrix Xk, the Halley-like iteration X → X(X′X + 3I)(3X′X + I)−1 converges globally and quadratically to the polar factor of X, and thus yields an element of the su... |
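The Halley-like iteration quoted in this context can be checked numerically: for a square nonsingular X with SVD X = UΣVᵀ, iterating Y ← Y(YᵀY + 3I)(3YᵀY + I)⁻¹ drives every singular value to 1, converging to the orthogonal polar factor UVᵀ. A small sketch:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.standard_normal((5, 5))          # generic square matrix, nonsingular a.s.

U, s, Vt = np.linalg.svd(X)
polar = U @ Vt                           # reference "angular" factor U V'

Y, I = X.copy(), np.eye(5)
for _ in range(20):                      # fast convergence: a few steps suffice
    Y = Y @ (Y.T @ Y + 3 * I) @ np.linalg.inv(3 * Y.T @ Y + I)

print(np.max(np.abs(Y - polar)))         # iterate matches the polar factor
```

Each step maps a singular value σ to σ(σ² + 3)/(3σ² + 1), whose only positive fixed point is 1, which is why the iterate converges to the orthogonal factor.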

16 |
Metric entropy of homogeneous spaces. In Quantum probability (Gdansk,
- Szarek
- 1997
(Show Context)
Citation Context ...ant C0, independent of m, n, and r, such that N1 ≤ (2C0/ɛ)^(r(m−r)) and N2 ≤ (2C0/ɛ)^(r(n−r)), (4.18) which completes the proof. The exact value of the universal constant C0 is not provided by Szarek in =-=[50]-=-. It takes the same value for any homogeneous space whose automorphism group is a subgroup of the orthogonal group, and is independent of the dimension of the homogeneous space. Hence, one might expec... |

8 |
On cone-invariant linear matrix inequalities.
- Parrilo, Khatri
- 2000
(Show Context)
Citation Context ...n. In certain instances with very special structure, the rank minimization problem can be solved by using the singular value decomposition, or can be exactly reduced to the solution of linear systems =-=[37, 41]-=-. In general, however, problem (1.1) is a challenging nonconvex optimization problem for which all known finite time algorithms have at least doubly exponential running times in both theory and practi... |

7 |
New Results about Random Covariance Matrices and Statistical Applications. Stanford Ph
- Karoui
- 2004
(Show Context)
Citation Context ...oment, then the maximum singular value of A is almost surely 1 + √(D/p) for D sufficiently large [56]. El Karoui uses this result to prove the concentration inequality (4.3) for all such distributions =-=[23]-=-. The result for Gaussians is rather tight with γ = 1/2 (see, e.g., [16]). Finally, note that a random projection also obeys all of the necessary concentration inequalities. Indeed, since the norm of ... |

5 | Jordan-algebraic approach to convexity theorem for quadratic mappings. http://www. optimization-online.org/DB HTML/2005/06/1159.html,
- Faybusovich
- 2005
(Show Context)
Citation Context ...sovich [25] and Schmieta and Alizadeh [43] that provide a unified development of interior point methods for symmetric cones, as well as Faybusovich’s work on convexity theorems for quadratic mappings =-=[26]-=-. Parsimonious models and optimization Sparsity and low-rank are two specific classes of parsimonious (or low-complexity) descriptions. Are there other kinds of easy-to-describe parametric models that... |

4 |
When does rank(A
- Marsaglia, Styan
- 1972
(Show Context)
Citation Context ...be additive, it is necessary and sufficient that the row and column spaces of the two matrices intersect only at the origin, since in this case they operate in essentially disjoint spaces (see, e.g., =-=[53]-=-). As we will show below, a related condition that ensures that the nuclear norm is additive, is that the matrices A and B have row and column spaces that are orthogonal. In fact, a compact sufficient... |

3 |
When does rank (A+B) = rank(A)+ rank(B)? Canad
- Marsaglia, Styan
- 1972
(Show Context)
Citation Context ...be additive, it is necessary and sufficient that the row and column spaces of the two matrices intersect only at the origin, since in this case they operate in essentially disjoint spaces (see, e.g., =-=[36]-=-). As we will show below, a related condition that ensures that the nuclear norm is additive, is that the matrices A and B have row and column spaces that are orthogonal. In fact, a compact sufficient... |


1 |
New Results about Random Covariance
- Karoui
- 2004
(Show Context)
Citation Context ...oment, then the maximum singular value of A is almost surely 1 + √(D/p) for D sufficiently large [84]. El Karoui used this result to prove the concentration inequality (4.3) for all such distributions =-=[32]-=-. The result for Gaussians is rather tight with γ = 1/2 (see, e.g., [23]). Finally, note that a random projection also obeys all of the necessary concentration inequalities. Indeed, since the norm of a... |

1 |
Karoui. New results about random covariance matrices and statistical applications.
- El
- 2004
(Show Context)
Citation Context ... the other has zeros in two-thirds of the entries: Aij = √(3/p) with probability 1/6, 0 with probability 2/3, and −√(3/p) with probability 1/6. (4.7) The fact that the top singular value of the matrix A is concentrated around 1 + √(D/p) for all of these ensembles follows from the work of Yin, Bai, and Krishnaiah, who showed that whenever the entries Aij are i.i.d. with zero mean and finite fourth moment, the maximum singular value of A is almost surely 1 + √(D/p) for D sufficiently large [56]. El Karoui uses this result to prove the concentration inequality (4.3) for all such distributions [23]. The result for Gaussians is rather tight with γ = 1/2 (see, e.g., [16]). Finally, note that a random projection also obeys all of the necessary concentration inequalities. Indeed, since the norm of a random projection is exactly √(D/p), (4.3) holds trivially. The concentration inequality (4.2) is proven in [15]. The main result of this section is the following: Theorem 4.2. Fix 0 < δ < 1. If A is a nearly isometric random variable, then for every 1 ≤ r ≤ m, there exist constants c0, c1 > 0, depending only on δ, such that, with probability at least 1 − exp(−c1p), δr(A) ≤ δ whenever p ≥ c0r(m + n) lo... |
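The Bai–Yin-type limit quoted in this context, that the top singular value of a p × D matrix of i.i.d. zero-mean entries with variance 1/p concentrates near 1 + √(D/p), can be checked empirically on a moderately large Gaussian matrix. The dimensions and tolerance below are illustrative, and the check is a sketch rather than a proof of concentration.

```python
import numpy as np

rng = np.random.default_rng(0)
p, D = 2000, 1000                              # rows, columns
A = rng.standard_normal((p, D)) / np.sqrt(p)   # i.i.d. N(0, 1/p) entries

top = np.linalg.svd(A, compute_uv=False)[0]    # largest singular value
predicted = 1 + np.sqrt(D / p)                 # asymptotic edge, ~1.707 here

print(top, predicted)
```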

1 |
When does rank (A+B)
- Marsaglia, Styan
- 1972
(Show Context)
Citation Context ...an or equal to the number of non-zeros in x plus the number of non-zeros of y; furthermore (by the triangle inequality) ‖x + y‖1 ≤ ‖x‖1 + ‖y‖1. In particular, the cardinality function is additive exactly when the vectors x and y have disjoint support. In this case, the ℓ1 norm is also additive, in the sense that ‖x + y‖1 = ‖x‖1 + ‖y‖1. For matrices, the rank function is subadditive. For the rank to be additive, it is necessary and sufficient that the row and column spaces of the two matrices intersect only at the origin, since in this case they operate in essentially disjoint spaces (see, e.g., [36]). As we will show below, a related condition that ensures that the nuclear norm is additive is that the matrices A and B have row and column spaces that are orthogonal. In fact, a compact sufficient condition for the additivity of the nuclear norm will be that AB′ = 0 and A′B = 0. This is a stronger requirement than the aforementioned condition for rank additivity, as orthogonal subspaces only intersect at the origin. The disparity arises because the nuclear norm of a linear map depends on the choice of the inner products on the spaces Rm and Rn on which the matrix acts, whereas the rank is ... |
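The sufficient condition stated in this context (AB′ = 0 and A′B = 0, i.e., orthogonal row and column spaces) can be checked on a block-diagonal example, where it holds by construction (a sketch):

```python
import numpy as np

rng = np.random.default_rng(0)
nuc = lambda M: np.linalg.norm(M, 'nuc')   # nuclear norm: sum of singular values

# Embed two blocks in disjoint row/column ranges, so the row and column
# spaces of A and B are orthogonal, giving A B' = 0 and A' B = 0.
A = np.zeros((6, 6)); A[:3, :3] = rng.standard_normal((3, 3))
B = np.zeros((6, 6)); B[3:, 3:] = rng.standard_normal((3, 3))

print(np.allclose(A @ B.T, 0) and np.allclose(A.T @ B, 0))   # condition holds
print(np.isclose(nuc(A + B), nuc(A) + nuc(B)))               # additivity holds
```

Rank is additive here as well, since the row and column spaces intersect only at the origin.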