## An accelerated proximal gradient algorithm for nuclear norm regularized least squares problems

### Citations

5400 | Convex Analysis
- Rockafellar
- 1970
Citation Context: ...onsider a more general unconstrained nonsmooth convex minimization problem of the form: min_{X∈ℜ^{m×n}} F(X) := f(X) + P(X), (8) where P : ℜ^{m×n} → (−∞, ∞] is a proper, convex, lower semicontinuous (lsc) [36] function and f is convex smooth (i.e., continuously differentiable) on an open subset of ℜ^{m×n} containing dom P = {X | P(X) < ∞}. We assume that dom P is closed and ∇f is Lipschitz continuous on dom P...

4203 | Regression shrinkage and selection via the lasso
- Tibshirani
- 1996
Citation Context: ...ℓ1-regularized linear least squares problem (also known as the basis pursuit de-noising problem) [12]: min_{x∈ℜ^n} (1/2)‖Ax − b‖_2^2 + µ‖x‖_1, (5) where µ is a given positive parameter; or the Lasso problem [37]: min_{x∈ℜ^n} { ‖Ax − b‖_2^2 : ‖x‖_1 ≤ t }, (6) where t is a given positive parameter. It is not hard to see that the problem (5) is equivalent to (6) in the sense that a solution of (5) is also that of...
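The closed-form proximal map of the ℓ1 norm (soft-thresholding) is the workhorse step behind most solvers for problem (5). A minimal sketch of the resulting iterative shrinkage-thresholding scheme, with hypothetical names `soft_threshold` and `ista`, assuming `tau` is at least the Lipschitz constant ‖AᵀA‖₂ of the gradient:

```python
import numpy as np

def soft_threshold(g, mu):
    """Proximal map of mu*||x||_1: minimizes 0.5*||x - g||^2 + mu*||x||_1."""
    return np.sign(g) * np.maximum(np.abs(g) - mu, 0.0)

def ista(A, b, mu, tau, iters=500):
    """Basic iterative shrinkage-thresholding for 0.5*||Ax - b||^2_2 + mu*||x||_1.
    The step size 1/tau requires tau >= ||A^T A||_2 for convergence."""
    x = np.zeros(A.shape[1])
    for _ in range(iters):
        grad = A.T @ (A @ x - b)                     # gradient of the smooth part
        x = soft_threshold(x - grad / tau, mu / tau) # proximal (shrinkage) step
    return x
```

This is only a sketch of the generic IST iteration, not the specific algorithm of the cited paper.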

3609 | Compressed sensing
- Donoho
- 2006
Citation Context: ...Ax = b }, (3) where ‖x‖_0 denotes the number of nonzero components in the vector x, A ∈ ℜ^{p×n}, and min_{x∈ℜ^n} { ‖x‖_1 : Ax = b }. (4) The problem (4) has attracted much interest in compressed sensing [8, 9, 10, 14, 15] and is also known as the basis pursuit problem. Recently, Recht et al. [35] established analogous theoretical results in the compressed sensing literature for the pair (1) and (2). In the basis pursu...

2717 | Atomic decomposition by basis pursuit
- Chen, Donoho, et al.
- 1999
Citation Context: ...nimized or constrained. In this case, the appropriate models to consider can either be the following ℓ1-regularized linear least squares problem (also known as the basis pursuit de-noising problem) [12]: min_{x∈ℜ^n} (1/2)‖Ax − b‖_2^2 + µ‖x‖_1, (5) where µ is a given positive parameter; or the Lasso problem [37]: min_{x∈ℜ^n} { ‖Ax − b‖_2^2 : ‖x‖_1 ≤ t }, (6) where t is a given positive parameter. It is not ha...

2621 | Robust uncertainty principles: exact signal reconstruction from highly incomplete frequency information
- Candès, Romberg, et al.
- 2006
Citation Context: ...Ax = b }, (3) where ‖x‖_0 denotes the number of nonzero components in the vector x, A ∈ ℜ^{p×n}, and min_{x∈ℜ^n} { ‖x‖_1 : Ax = b }. (4) The problem (4) has attracted much interest in compressed sensing [8, 9, 10, 14, 15] and is also known as the basis pursuit problem. Recently, Recht et al. [35] established analogous theoretical results in the compressed sensing literature for the pair (1) and (2). In the basis pursu...

1505 | Near optimal signal recovery from random projections: Universal encoding strategies?
- Candès, Tao
- 2006
Citation Context: ...Ax = b }, (3) where ‖x‖_0 denotes the number of nonzero components in the vector x, A ∈ ℜ^{p×n}, and min_{x∈ℜ^n} { ‖x‖_1 : Ax = b }. (4) The problem (4) has attracted much interest in compressed sensing [8, 9, 10, 14, 15] and is also known as the basis pursuit problem. Recently, Recht et al. [35] established analogous theoretical results in the compressed sensing literature for the pair (1) and (2). In the basis pursu...

1056 | A fast iterative shrinkage-thresholding algorithm for linear inverse problems
- Beck, Teboulle
- 2009
Citation Context: ...‖_F ≤ L_f‖X − Y‖_F for all X, Y ∈ dom P, (9) for some positive scalar L_f. The problem (7) is a special case of (8) with f(X) = (1/2)‖A(X) − b‖_2^2 and P(X) = µ‖X‖_∗ with dom P = ℜ^{m×n}. Recently, Beck and Teboulle [4] proposed a fast iterative shrinkage-thresholding algorithm (abbreviated FISTA) to solve (8) for the vector case where n = 1 and dom P = ℜ^m, targeting particularly (5) arising in signal/image process...
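FISTA augments the shrinkage step with a specific extrapolation sequence t_k. A minimal sketch for the ℓ1 case (5), assuming the standard t_{k+1} = (1 + √(1 + 4t_k²))/2 recurrence from the cited paper and a hypothetical function name `fista`; `tau` is assumed to be a Lipschitz constant of the gradient, e.g. ‖A‖₂²:

```python
import numpy as np

def fista(A, b, mu, tau, iters=500):
    """Sketch of an accelerated (FISTA-style) scheme for
    0.5*||Ax - b||^2_2 + mu*||x||_1."""
    x = np.zeros(A.shape[1])
    y = x.copy()
    t = 1.0
    for _ in range(iters):
        grad = A.T @ (A @ y - b)                       # gradient at extrapolated point y
        step = y - grad / tau
        x_new = np.sign(step) * np.maximum(np.abs(step) - mu / tau, 0.0)  # shrinkage
        t_new = (1.0 + np.sqrt(1.0 + 4.0 * t * t)) / 2.0  # t_{k+1} = (1+sqrt(1+4 t_k^2))/2
        y = x_new + ((t - 1.0) / t_new) * (x_new - x)     # extrapolation (momentum) step
        x, t = x_new, t_new
    return x
```

The extrapolation weight (t_k − 1)/t_{k+1} is what yields the O(1/k²) rate in function values, versus O(1/k) for the plain shrinkage iteration.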

745 | An iterative thresholding algorithm for linear inverse problems with a sparsity constraint
- Daubechies, Defrise, et al.
- 2004
Citation Context: ...f(X) = (1/2)‖AX − b‖_2^2, P(X) = µ‖X‖_1 and n = 1 (hence X ∈ ℜ^m), it is the popular iterative shrinkage/thresholding (IST) algorithms that have been developed and analyzed independently by many researchers [13, 20, 21, 24]. When P ≡ 0 in the problem (8), Algorithm 1 with t_k = 1 for all k reduces to the standard gradient algorithm. For the gradient algorithm, it is known that the sequence of function values F(X^k) ca...

562 | Guaranteed minimum-rank solutions of linear matrix equations via nuclear norm minimization
- Recht, Fazel, et al.
- 2010
Citation Context: ...A ∈ ℜ^{p×n}, and min_{x∈ℜ^n} { ‖x‖_1 : Ax = b }. (4) The problem (4) has attracted much interest in compressed sensing [8, 9, 10, 14, 15] and is also known as the basis pursuit problem. Recently, Recht et al. [35] established analogous theoretical results in the compressed sensing literature for the pair (1) and (2). In the basis pursuit problem (4), b is a vector of measurements of the signal x obtained by us...

555 | A singular value thresholding algorithm for matrix completion
- Cai, Candès, et al.
- 2008
Citation Context: ...+ p^2(m + n)^2 + p^3) and the memory requirement grows like O((m + n)^2 + p^2). 3.2 Review of existing algorithms for matrix completion. In this section, we review two recently developed algorithms [6, 28] for solving the matrix completion problem. First of all, we consider the following minimization problem: min_{X∈ℜ^{m×n}} (τ/2)‖X − G‖_F^2 + µ‖X‖_∗, (18) where G is a given matrix in ℜ^{m×n}. If G = Y − τ^{−1}A^∗...
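Problem (18) has a closed-form minimizer: soft-threshold the singular values of G at level µ/τ (since (18) equals τ times (1/2)‖X − G‖²_F + (µ/τ)‖X‖_∗). A sketch of this singular value shrinkage operator, with the hypothetical name `svt`:

```python
import numpy as np

def svt(G, nu):
    """Singular value shrinkage: closed-form minimizer of
    (1/2)*||X - G||_F^2 + nu*||X||_* over X.
    For problem (18), call with nu = mu / tau."""
    U, s, Vt = np.linalg.svd(G, full_matrices=False)
    # Soft-threshold the singular values; directions with s_i <= nu are dropped,
    # which is what produces low-rank iterates.
    return U @ np.diag(np.maximum(s - nu, 0.0)) @ Vt
```

Computing this full SVD at every iteration is the dominant cost noted in the snippet, which is why the cited algorithms resort to partial or approximate SVDs.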

521 | Smooth minimization of non-smooth functions
- Nesterov
Citation Context: ...d promising numerical results for wavelet-based image deblurring. This algorithm is in the class of accelerated proximal gradient algorithms that were studied by Nesterov, Nemirovski, and others; see [30, 31, 32, 34, 40] and references therein. These accelerated proximal gradient algorithms have an attractive iteration complexity of O(1/√ε) for achieving ε-optimality; see Section 3. We extend Beck and Teboulle's...

377 | Eigentaste: A constant time collaborative filtering algorithm
- Goldberg, Roeder, et al.
Citation Context: ...Numerical experiments on real matrix completion problems. In this section, we consider matrix completion problems based on some real data sets, namely, the Jester joke data set [23] and the MovieLens data set [25]. The Jester joke data set contains 4.1 million ratings for 100 jokes from 73421 users and is available on the website http://www.ieor.berkeley.edu/~goldberg/jester-dat...

371 | Sparse reconstruction by separable approximation
- Wright, Nowak, et al.
- 2009
Citation Context: ...int S_τ(G) can be found in closed form, which is an advantage of algorithms using (12) to update the current point (i.e., α_k = 1 for all k) or compute a direction for large-scale optimization problems [24, 41, 42, 43]. When Algorithm 1, with fixed constants τ_k > 0, t_k = 1, and α_k = 1 for all k, is applied to the problem (5), i.e., (8) with f(X) = (1/2)‖AX − b‖_2^2, P(X) = µ‖X‖_1 and n = 1 (hence X ∈ ℜ^m), it i...

338 | Problem complexity and method efficiency in optimization (Wiley-Interscience Series in Discrete Mathematics)
- Nemirovskii, Yudin
- 1983
Citation Context: ...d promising numerical results for wavelet-based image deblurring. This algorithm is in the class of accelerated proximal gradient algorithms that were studied by Nesterov, Nemirovski, and others; see [30, 31, 32, 34, 40] and references therein. These accelerated proximal gradient algorithms have an attractive iteration complexity of O(1/√ε) for achieving ε-optimality; see Section 3. We extend Beck and Teboulle's...

298 | A method for solving the convex programming problem with convergence rate O(1/k^2)
- Nesterov
- 1983
Citation Context: ...d promising numerical results for wavelet-based image deblurring. This algorithm is in the class of accelerated proximal gradient algorithms that were studied by Nesterov, Nemirovski, and others; see [30, 31, 32, 34, 40] and references therein. These accelerated proximal gradient algorithms have an attractive iteration complexity of O(1/√ε) for achieving ε-optimality; see Section 3. We extend Beck and Teboulle's...

287 | Matrix Rank Minimization with Applications
- Fazel
- 2002
Citation Context: ...n embedding [39]. In general, this affine rank minimization problem (1) is an NP-hard nonconvex optimization problem. A recent convex relaxation of this affine rank minimization problem introduced in [19] minimizes the nuclear norm over the same constraints: min_{X∈ℜ^{m×n}} { ‖X‖_∗ : A(X) = b }. (2) The nuclear norm is the best convex approximation of the rank function over the unit ball of matrices. A pa...

274 | A rank minimization heuristic with application to minimum order system approximation.
- Fazel, Hindi, et al.
- 2001
Citation Context: ...}, (1) where A : ℜ^{m×n} → ℜ^p is a linear map and b ∈ ℜ^p. We denote the adjoint of A by A^∗. The problem (1) has appeared in the literature of diverse fields including machine learning [1, 3], control [17, 18, 29], and Euclidean embedding [39]. In general, this affine rank minimization problem (1) is an NP-hard nonconvex optimization problem. A recent convex relaxation of this affine rank minimization problem ...

216 | Fast Monte Carlo algorithms for matrices I-III: computing a compressed approximate matrix decomposition,
- Drineas, Kannan, et al.
- 2006
Citation Context: ...n each iteration of the FPC and SVT algorithms lies in computing the SVD of G^k. In [28], Ma et al. use a fast Monte Carlo algorithm such as the Linear Time SVD algorithm developed by Drineas et al. [16] to reduce the time for computing the SVD. In addition, they compute only the predetermined sv_k largest singular values and corresponding singular vectors to further reduce the computational time at e...

196 | Fixed point and Bregman iterative methods for matrix rank minimization
- Ma, Goldfarb, et al.
- 2011
Citation Context: ...arized linear least squares problem (7) and introduce three techniques to accelerate the convergence of our algorithm. In Section 4, we compare our algorithm with a fixed point continuation algorithm [28] for solving (7) on randomly generated matrix completion problems with moderate dimensions. We also present numerical results for solving a set of large-scale randomly generated matrix completion prob...

181 | Quantitative robust uncertainty principles and optimally sparse decompositions
- Candes, Romberg
- 2006

160 | A coordinate gradient descent method for nonsmooth separable minimization
- Tseng, Yun
Citation Context: ...int S_τ(G) can be found in closed form, which is an advantage of algorithms using (12) to update the current point (i.e., α_k = 1 for all k) or compute a direction for large-scale optimization problems [24, 41, 42, 43]. When Algorithm 1, with fixed constants τ_k > 0, t_k = 1, and α_k = 1 for all k, is applied to the problem (5), i.e., (8) with f(X) = (1/2)‖AX − b‖_2^2, P(X) = µ‖X‖_1 and n = 1 (hence X ∈ ℜ^m), it i...

141 | Introductory Lectures on Convex Optimization
- Nesterov
- 2004

70 | A bound optimization approach to wavelet-based image deconvolution
- Figueiredo, Nowak
Citation Context: ...f(X) = (1/2)‖AX − b‖_2^2, P(X) = µ‖X‖_1 and n = 1 (hence X ∈ ℜ^m), it is the popular iterative shrinkage/thresholding (IST) algorithms that have been developed and analyzed independently by many researchers [13, 20, 21, 24]. When P ≡ 0 in the problem (8), Algorithm 1 with t_k = 1 for all k reduces to the standard gradient algorithm. For the gradient algorithm, it is known that the sequence of function values F(X^k) ca...

66 | Neighborliness of randomly projected simplices in high dimensions
- Donoho, Tanner

61 | On the rank minimization problem over a positive semidefinite linear matrix inequality
- Mesbahi, Papavassilopoulos
- 1997

49 | PROPACK: Software for large and sparse SVD calculations, http://soi.stanford.edu/~rmunk/PROPACK
- Larsen
- 2004
Citation Context: ...ctor ϱ^{k−1} used to form X^{k−1} = U^{k−1} Diag(ϱ^{k−1})(V^{k−1})^T. And if the non-expansive property (see [28, Lemma 1]) is violated 10 times, sv_k is increased by 1. In contrast, Cai et al. [6] used PROPACK [26] (a variant of the Lanczos algorithm) to compute a partial SVD of G^k. They also compute only the predetermined sv_k largest singular values and corresponding singular vectors to reduce the computation...

42 | Rank minimization under LMI constraints: A framework for output feedback problems.
- El Ghaoui, Gahinet
- 1993
Citation Context: ...}, (1) where A : ℜ^{m×n} → ℜ^p is a linear map and b ∈ ℜ^p. We denote the adjoint of A by A^∗. The problem (1) has appeared in the literature of diverse fields including machine learning [1, 3], control [17, 18, 29], and Euclidean embedding [39]. In general, this affine rank minimization problem (1) is an NP-hard nonconvex optimization problem. A recent convex relaxation of this affine rank minimization problem ...

41 | Low-rank matrix factorization with attributes
- Abernethy, Bach, et al.
- 2006
Citation Context: ...rank(X) : A(X) = b }, (1) where A : ℜ^{m×n} → ℜ^p is a linear map and b ∈ ℜ^p. We denote the adjoint of A by A^∗. The problem (1) has appeared in the literature of diverse fields including machine learning [1, 3], control [17, 18, 29], and Euclidean embedding [39]. In general, this affine rank minimization problem (1) is an NP-hard nonconvex optimization problem. A recent convex relaxation of this affine rank...

38 | A fixed-point continuation method for ℓ1-regularized minimization with applications to compressed sensing
- Hale, Yin, et al.
- 2007
Citation Context: ...ant of (5) or (6), provided that the matrix A satisfies a certain restricted isometry property. Many algorithms have been proposed to solve (5) and (6), targeting particularly large-scale problems; see [24, 37, 42] and references therein. This motivates us to consider an alternative convex relaxation to the affine rank minimization problem, namely, the following nuclear norm regularized linear least squares pro...

31 | A generalized proximal point algorithm for certain non-convex minimization problems
- Fukushima, Mine
- 1981
Citation Context: ...⟨I, X⟩ is smooth on ℜ^{n×n} and so we are able to compute S_{τ_k}(G^k). In addition, since S_{τ_k}(G^k) ∈ S^n_+, we have X^k ∈ S^n_+ if α_k ≤ 1 for all k. For the vector case where n = 1 in (8), Fukushima and Mine [22] studied a proximal gradient descent method using (11) to compute a descent direction (i.e., Algorithm 1 with t_k = 1 for all k) with stepsize α_k chosen by an Armijo-type rule. If P is separable, the m...

30 | Distance matrix completion by numerical optimization.
- Trosset
- 2000
Citation Context: ...r map and b ∈ ℜ^p. We denote the adjoint of A by A^∗. The problem (1) has appeared in the literature of diverse fields including machine learning [1, 3], control [17, 18, 29], and Euclidean embedding [39]. In general, this affine rank minimization problem (1) is an NP-hard nonconvex optimization problem. A recent convex relaxation of this affine rank minimization problem introduced in [19] minimizes t...

25 | On an approach to the construction of optimal methods of minimization of smooth convex functions
- Nesterov
- 1988

7 | A coordinate gradient descent method for ℓ1-regularized convex minimization
- Yun, Toh
Citation Context: ...ant of (5) or (6), provided that the matrix A satisfies a certain restricted isometry property. Many algorithms have been proposed to solve (5) and (6), targeting particularly large-scale problems; see [24, 37, 42] and references therein. This motivates us to consider an alternative convex relaxation to the affine rank minimization problem, namely, the following nuclear norm regularized linear least squares pro...

3 | An EM algorithm for wavelet-based image restoration
- Figueiredo, Nowak
Citation Context: ...f(X) = (1/2)‖AX − b‖_2^2, P(X) = µ‖X‖_1 and n = 1 (hence X ∈ ℜ^m), it is the popular iterative shrinkage/thresholding (IST) algorithms that have been developed and analyzed independently by many researchers [13, 20, 21, 24]. When P ≡ 0 in the problem (8), Algorithm 1 with t_k = 1 for all k reduces to the standard gradient algorithm. For the gradient algorithm, it is known that the sequence of function values F(X^k) ca...

2 | Uncovering shared structures in multiclass classification
- Amit, Fink, et al.
- 2007
Citation Context: ...rank(X) : A(X) = b }, (1) where A : ℜ^{m×n} → ℜ^p is a linear map and b ∈ ℜ^p. We denote the adjoint of A by A^∗. The problem (1) has appeared in the literature of diverse fields including machine learning [1, 3], control [17, 18, 29], and Euclidean embedding [39]. In general, this affine rank minimization problem (1) is an NP-hard nonconvex optimization problem. A recent convex relaxation of this affine rank...

2 | SDPT3 - a MATLAB software package for semidefinite programming, Optimization Methods and Software 11
- Toh, Todd, et al.
- 1999
Citation Context: ...ies, and it can be done by solving an aforementioned convex relaxation (2) of (1), i.e., min_{X∈ℜ^{m×n}} { ‖X‖_∗ : X_ij = M_ij, (i, j) ∈ Ω }. (16) In [11], the convex relaxation (16) was solved using SDPT3 [38], which is one of the most advanced semidefinite programming solvers. The problem (16) can be reformulated as a semidefinite program as follows; see [35] for details: min_{X,W1,W2} (1/2)(⟨W1, I_m⟩ + ⟨W2, ...
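The semidefinite program whose objective is truncated in the snippet is the standard reformulation of nuclear norm minimization (see [35]); for completeness, it can be written as:

```latex
\min_{X,\, W_1,\, W_2} \;\; \tfrac{1}{2}\bigl(\langle W_1, I_m\rangle + \langle W_2, I_n\rangle\bigr)
\quad \text{s.t.} \quad
\begin{bmatrix} W_1 & X \\ X^T & W_2 \end{bmatrix} \succeq 0,
\qquad X_{ij} = M_{ij}, \;\; (i,j) \in \Omega.
```

At an optimum, tr(W_1) + tr(W_2) equals twice the nuclear norm of X, which is why the semidefinite objective matches ‖X‖_∗.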

1 | Matrix completion with noise, preprint, Caltech
- Candès, Plan
- 2009
Citation Context: ...ents, and yet the errors are all smaller than the noise level (nf = 0.1) in the given data. The errors obtained here are consistent with (and actually more accurate than) the theoretical result established in [7]. (Table 3: Numerical results on random matrix completion problems without noise.)

1 | Convex optimization methods for dimension reduction and coefficient estimation in multivariate linear regression, January 2008 (revised)
- Lu, Monteiro, Yuan
- 2009
Citation Context: ...s, such as interior point methods or semismooth Newton methods, developed for (5) be extended to solve (7)? Second, can other acceleration techniques be developed for solving (7)? Recently Lu et al. [27] considered a large-scale nuclear norm regularized least squares problem (7) arising from multivariate linear regression, where the linear map A is given by A(X) = ΛX for X ∈ ℜ^{m×n}, and Λ ∈ ℜ^{m×m} is ...