
## Robust PCA with Partial Subspace Knowledge (2014)

Venue: IEEE Intl. Symp. on Information Theory (ISIT)

Citations: 5 (2 self)

### Citations

2306 | Eigenfaces vs. Fisherfaces: Recognition using class specific linear projection
- Belhumeur, Hespanha, et al.
- 1997
Citation Context: [Fig. 4: Phase transition plots with rnew = ⌊0.15r⌋, rextra = ⌊0.15r⌋, n1 = 400; panels (c) mod-PCP, n2 = 200, and (d) PCP, n2 = 200.] C. Real data (face reconstruction application) As stated in [3], robust PCA is useful in face recognition to remove sparse outliers, such as cast shadows, specularities, or eyeglasses, from a sequence of images of the same face. As explained there, without outliers, face images arranged as columns of a matrix are known to form an approximately low-rank matrix. Here we use the images from the Yale Face Database [33], which is also used in [3]. Outlier-free training data, consisting of face images taken under a few illumination conditions but all without eyeglasses, is used to obtain a partial subspace estimate. The test data consists of face images under different lighting conditions and with eyeglasses or other outliers. For the test data, the goal is to reconstruct a clear face image with the cast shadows, eyeglasses, or other outliers removed. Thus, the clear face image should be a column of the estimated low-rank matrix, while the cast shadows or eyeglasses should be a column of the sparse matrix. Each image...

933 | Robust face recognition via sparse representation
- Wright, Yang, et al.
- 2009
Citation Context: ...to 99% energy. This results in rG = 38. We use another two face images per subject for each of the twelve subjects, some with glasses and some without, as the test data, i.e., the measurement matrix M. Thus M is 19520 × 24. In the experiments, we compare modified-PCP with PCP [3] and ReProCS [20], [21], and also with some of the other algorithms compared in [21]: robust subspace learning (RSL) [34], a batch robust PCA algorithm that was compared against in [3], and GRASTA [35], a very recent online robust PCA algorithm. We also compare against Dense Error Correction (DEC) [2], [36], since it was the first to address this application using ℓ1 minimization. To implement DEC [2], [36], we normalize each column of MG to get the dictionary (D)n1×48, and we solve (xi, si) = argmin_{x,s} ∥x∥1 + ∥s∥1 subject to Mi = Dx + s using YALL-1. Here Mi is the ith column of M. The solution gives us si and ℓi = Dxi. For PCP and RSL, we use the test dataset only, i.e., M, which is a 19520 × 24 matrix, as the measurement matrix. DEC, ReProCS, and GRASTA are provided the same partial knowledge that mod-PCP gets. Fig. 5 shows 3 cases where mod-PCP successfully removes...
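The DEC step quoted above solves a joint ℓ1 minimization over the dictionary coefficients x and the sparse error s. The paper uses YALL-1; as a rough stand-in, the same program can be posed as a linear program via the standard positive/negative variable split. The sizes below are toy stand-ins (not the 19520 × 48 dictionary from the paper), and the data are synthetic:

```python
import numpy as np
from scipy.optimize import linprog

rng = np.random.default_rng(0)

# Toy stand-ins for the paper's quantities: D plays the role of the
# normalized training-face dictionary, m plays the role of one test image.
n1, rG = 30, 8
D = rng.standard_normal((n1, rG))
D /= np.linalg.norm(D, axis=0)              # normalize each column, as in the paper
x_true = np.zeros(rG); x_true[:3] = rng.standard_normal(3)
s_true = np.zeros(n1); s_true[rng.choice(n1, 3, replace=False)] = 5.0
m = D @ x_true + s_true

# min ||x||_1 + ||s||_1  s.t.  m = D x + s, via the LP split
# x = xp - xn, s = sp - sn with xp, xn, sp, sn >= 0.
c = np.ones(2 * rG + 2 * n1)
A_eq = np.hstack([D, -D, np.eye(n1), -np.eye(n1)])
res = linprog(c, A_eq=A_eq, b_eq=m, bounds=(0, None))
x_hat = res.x[:rG] - res.x[rG:2 * rG]
s_hat = res.x[2 * rG:2 * rG + n1] - res.x[2 * rG + n1:]

ell_hat = D @ x_hat                          # "face" part, as l_i = D x_i above
print(np.linalg.norm(D @ x_hat + s_hat - m)) # constraint residual (near zero)
```

The LP optimum is guaranteed feasible and no worse in objective than the true pair, though exact recovery of (x, s) depends on the usual ℓ1 conditions.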

864 | Exact matrix completion via convex optimization - Candes, Recht - 2009

560 | Robust principal component analysis
- Candès, Li, et al.
- 2011
Citation Context: ...bust PCA problem occurs in various applications ranging from video analysis to recommender system design in the presence of outliers, e.g., for Netflix movies, to anomaly detection in dynamic networks [2]. In video analysis, background image sequences are well modeled as forming a low-rank but dense matrix because they change slowly over time and the changes are typically global. Foreground is a spars...

554 | A singular value thresholding algorithm for matrix completion
- Cai, Candès, et al.
- 2010
Citation Context: ...tial background-only sequence. For this application, modified-PCP can be used to design a piecewise batch solution that will be faster and will require weaker assumptions for exact recovery than PCP. This is made precise in Corollary IV.1. We also show extensive simulation comparisons and some real data comparisons of modified-PCP with PCP and with other existing robust PCA solutions from the literature. The implementation requires a fast algorithm for solving the modified-PCP program. We develop this by modifying the Inexact Augmented Lagrange Multiplier Method of [15] and using the idea of [16], [17] for the sparse recovery step. Notation. For a matrix X, we denote by X∗ the transpose of X; denote by ∥X∥∞ the ℓ∞ norm of X reshaped as a long vector, i.e., maxi,j |Xij|; denote by ∥X∥ the operator norm or 2-norm; denote by ∥X∥F the Frobenius norm. Let I denote the identity operator, i.e., I(Y) = Y for any matrix Y. Let ∥A∥ denote the operator norm of the operator A, i.e., ∥A∥ = sup{X : ∥X∥F = 1} ∥A(X)∥F; let ⟨X,Y⟩ denote the Euclidean inner product between two matrices, i.e., trace(X∗Y); let sgn(X) denote the entrywise sign of X. We let PΘ denote the orthogonal projection onto a linear subspace Θ of...
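The notation paragraph above fixes several matrix norms and the trace inner product. A quick numerical sanity check of these definitions, using NumPy conventions:

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.standard_normal((4, 3))
Y = rng.standard_normal((4, 3))

linf  = np.max(np.abs(X))           # ||X||_inf: max_{i,j} |X_ij| (X as a long vector)
op2   = np.linalg.norm(X, 2)        # ||X||: operator (spectral) norm
frob  = np.linalg.norm(X, 'fro')    # ||X||_F: Frobenius norm
inner = np.trace(X.T @ Y)           # <X, Y> = trace(X* Y)

# trace(X* Y) equals the entrywise (vectorized) dot product
assert np.isclose(inner, np.sum(X * Y))
# the operator norm equals the largest singular value
assert np.isclose(op2, np.linalg.svd(X, compute_uv=False)[0])
# the Frobenius norm is the l2 norm of the vectorized matrix
assert np.isclose(frob, np.sqrt(np.sum(X ** 2)))
```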

475 | Just relax: convex programming methods for identifying sparse signals in noise
- Tropp
- 2006
Citation Context: ...solve (8). This algorithm is a direct modification of the algorithm designed to solve PCP in [29]. Using the same ideas, along with an accurate recovery result for the basis pursuit denoising (BPDN) [30] problem, it should be possible to prove that the output of the algorithm converges to the solution of modified-PCP. For the modified-PCP program (8), the Augmented Lagrangian function is: L(L̃, S̃, Y...

349 | Introduction to the non-asymptotic analysis of random matrices
- Vershynin
- 2011
Citation Context: ...n Appendix B. A similar statement is given in Appendix A.1 of [3], but without a proof. The expression for the second term on the right-hand side given there is e^(−2n1n2ϵ0²/ρ0), which is different from the one we derive. Lemma V.2. Let E be an n1 × n2 random matrix with entries i.i.d. (independently and identically distributed) as Eij = 1 w.p. ρs/2, 0 w.p. 1 − ρs, −1 w.p. ρs/2. (15) If ρs < 0.03 and (n1 + n2)^(1/6)/log(n1 + n2) > 10.5/(ρs^(1/6)(1 − 5.6561√ρs)), then P(∥E∥ ≥ 0.5√n(1)) ≤ n(1)^(−10). The proof is provided in Appendix C and uses the result of [24]. In [3], the authors claim that, using [25], ∥E∥ > 0.25√n(1) w.p. less than n(1)^(−10). While the claim is correct, it is not possible to prove it using any of the results from [25]. Using ideas from [25], one can only show that the above holds when n(2) is upper bounded by a constant times log n(1) (see Appendix H), which is a strong extra assumption. B. Proof Architecture The proof of the theorem involves 4 main steps. (a) The first step is to show that when the locations of the support of S are Bernoulli distributed with parameter ρs and the signs of S are i.i.d. ±1 with probability 1/2 (and independent from the locations), and...
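The sampling model in (15) is easy to simulate; one draw below, with ρs = 0.02 < 0.03 as the lemma requires. The observed spectral norm sits well below the 0.5√n(1) threshold that the lemma bounds with high probability:

```python
import numpy as np

rng = np.random.default_rng(2)
n1, n2, rho_s = 400, 200, 0.02
# Entries i.i.d.: +1 w.p. rho_s/2, -1 w.p. rho_s/2, 0 w.p. 1 - rho_s
U = rng.random((n1, n2))
E = np.where(U < rho_s / 2, 1.0, np.where(U < rho_s, -1.0, 0.0))

n_1 = max(n1, n2)                   # n_(1) = max(n1, n2) in the paper's notation
spec = np.linalg.norm(E, 2)
print(spec, 0.5 * np.sqrt(n_1))     # spectral norm vs. the lemma's threshold
```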

328 | The augmented Lagrange multiplier method for exact recovery of corrupted low-rank matrices
- Lin, Chen, et al.
- 2010
Citation Context: ...n of the assumptions needed by PCP and mod-PCP for the simulated data. A. Algorithm for solving Modified-PCP We give below an algorithm based on the Inexact Augmented Lagrange Multiplier (ALM) method [29] to solve the modified-PCP program, i.e., solve (8). This algorithm is a direct modification of the algorithm designed to solve PCP in [29]. Using the same ideas, along with an accurate recovery result...
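The snippet does not spell out the modified-PCP solver itself, so below is only a minimal sketch of the classical PCP inexact-ALM loop from [29] that it modifies: a singular-value-thresholding step for L, entrywise soft-thresholding for S, a dual update, and the v = 1.5 penalty continuation (capped at 10⁷ times its initial value) mentioned elsewhere in the text:

```python
import numpy as np

def shrink(X, tau):
    """Entrywise soft-thresholding."""
    return np.sign(X) * np.maximum(np.abs(X) - tau, 0.0)

def svt(X, tau):
    """Singular value thresholding: soft-threshold the singular values."""
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    return (U * np.maximum(s - tau, 0.0)) @ Vt

def pcp_ialm(M, lam=None, iters=500, tol=1e-7):
    """Inexact-ALM sketch for classical PCP (not the paper's modified-PCP
    solver): min ||L||_* + lam*||S||_1  s.t.  L + S = M."""
    n1, n2 = M.shape
    lam = lam or 1.0 / np.sqrt(max(n1, n2))
    mu = 1.25 / np.linalg.norm(M, 2)
    mu_bar = 1e7 * mu                       # cap, mirroring tau_bar = 1e7 * tau_0
    L = np.zeros_like(M); S = np.zeros_like(M); Y = np.zeros_like(M)
    normM = np.linalg.norm(M, 'fro')
    for _ in range(iters):
        L = svt(M - S + Y / mu, 1.0 / mu)
        S = shrink(M - L + Y / mu, lam / mu)
        Y = Y + mu * (M - L - S)
        mu = min(1.5 * mu, mu_bar)          # v = 1.5 continuation
        if np.linalg.norm(M - L - S, 'fro') < tol * normM:
            break
    return L, S
```

On an easy synthetic instance (small rank, few corrupted entries) this loop recovers the low-rank part to high accuracy.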

325 | Online learning for matrix factorization and sparse coding
- Mairal, Bach, et al.
Citation Context: ...e been shown to approximate the batch solution and do so only asymptotically. Other somewhat related work includes online algorithms for low-rank matrix completion and dictionary learning [25], [26], [27]. III. PROOF OUTLINE The overall proof approach is similar to that in [2]. The first step involves starting with the KKT conditions and relaxing them to find a set of conditions under which Lnew, S is...

325 | Combinatorial problems and exercises - Lovász - 2007

226 | Fundamentals of convex analysis
- Hiriart-Urruty, Lemaréchal
- 2004
Citation Context: ...= 0.4ρs n1n2, and all the other assumptions on L, n1, n2, ρs, ρr from Theorem III.1 are satisfied. Thus, all we need to do is to prove step (a). To do this, we start with the KKT conditions and strengthen them to get a set of easy-to-satisfy sufficient conditions on the dual certificate under which Lnew, S is the unique minimizer of (7). This is done in Sec. V-E. Next, we use the golfing scheme [26], [3] to construct a dual certificate that satisfies the required conditions (Sec. V-F). C. Basic Facts We state some basic facts which will be used in the following proof. Definition V.3 (Sub-gradient [27]). Consider a convex function f : O → R on a convex set of matrices O. A matrix Y is called its sub-gradient at a point X0 ∈ O if f(X) − f(X0) ≥ ⟨Y, X − X0⟩ for all X ∈ O. The set of all sub-gradients of f at X0 is denoted by ∂f(X0). It is known [28], [29] that ∂∥Lnew∥∗ = {UnewV∗new + W : PTnewW = 0, ∥W∥ ≤ 1} and ∂∥S∥1 = {F : PΩF = sgn(S), ∥F∥∞ ≤ 1}. Definition V.4 (Dual norm [8]). The matrix norm ∥·∥♡ is said to be dual to the matrix norm ∥·∥♠ if, for all Y1 ∈ Rn1×n2, ∥Y1∥♡ = sup{∥Y2∥♠ ≤ 1} ⟨Y1,Y2⟩. Proposition V.5 (Proposition 2.1 of [30]). The following pairs of matrix norms are dual to each ot...
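The dual-norm pairs in Definition V.4 can be checked numerically: for the (∥·∥∗, ∥·∥) pair the supremum is attained at Z = UV∗ from the SVD of Y, and for the (∥·∥1, ∥·∥∞) pair at Z = sgn(Y):

```python
import numpy as np

rng = np.random.default_rng(4)
Y = rng.standard_normal((5, 4))
U, s, Vt = np.linalg.svd(Y, full_matrices=False)

# Nuclear and operator norms are dual: sup_{||Z|| <= 1} <Y, Z> = ||Y||_*
Z = U @ Vt                                       # feasible: ||Z|| = 1
assert np.isclose(np.linalg.norm(Z, 2), 1.0)
assert np.isclose(np.trace(Y.T @ Z), s.sum())    # attains ||Y||_* = sum of sing. values

# l1 and l_inf norms are dual: sup_{||Z||_inf <= 1} <Y, Z> = ||Y||_1
Z1 = np.sign(Y)                                  # feasible: ||Z1||_inf = 1
assert np.isclose(np.sum(Y * Z1), np.abs(Y).sum())
```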

225 | Rank-sparsity incoherence for matrix decomposition - Chandrasekaran, Sanghavi, et al.

185 | Recovering low-rank matrices from few coefficients in any basis
- Gross
Citation Context: ...> ∥Lnew∥∗ + λ∥S∥1. The last inequality holds because ∥PΩPΠ∥ < 1, which implies that Π ∩ Ω = {0}, and so at least one of PΠ⊥H or PΩ⊥H is strictly positive for H ≠ 0. Thus, the cost function is strictly increased by any feasible perturbation. Since the cost is convex, this proves the lemma. Lemma V.7 is equivalently saying that (Lnew, S, L∗G) is the unique solution to Modified-PCP (7) if there is a W satisfying: W ∈ Π⊥, ∥W∥ ≤ 9/10, ∥PΩ(UnewV∗new − λ sgn(S) + W)∥F ≤ λ/4, ∥PΩ⊥(UnewV∗new + W)∥∞ < 9λ/10. (20) F. Construction of the required dual certificate The golfing scheme was introduced in [32], [26]; here we use it with some modifications, similar to those in [3], to construct the dual certificate. Assume that Ω ∼ Ber(ρs) or, equivalently, Ωc ∼ Ber(1 − ρs). Notice that Ωc can be generated as a union of j0 i.i.d. sets {Ωj}, j = 1, ..., j0, where Ωj ∼ i.i.d. Ber(q), 1 ≤ j ≤ j0, with q, j0 satisfying ρs = (1 − q)^j0. This is true because P((i, j) ∈ Ω) = P((i, j) ∉ Ω1 ∪ Ω2 ∪ · · · ∪ Ωj0) = (1 − q)^j0. As there is overlap between the Ωj's, we have q ≥ (1 − ρs)/j0. Let W = WL + WS, where WL, WS are constructed similarly to [3] as: • Construction of WL via the golfing scheme. Let Y0 = 0, Yj = Yj−1 + q^−1 PΩj PΠ(UnewV...
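The golfing-scheme splitting of Ωc into j0 i.i.d. Ber(q) layers with ρs = (1 − q)^j0 can be sanity-checked empirically: the complement of the union of the layers has entry density (1 − q)^j0 = ρs.

```python
import numpy as np

rng = np.random.default_rng(5)
n1, n2, j0, rho_s = 300, 300, 5, 0.1
q = 1.0 - rho_s ** (1.0 / j0)            # chosen so that (1 - q)^j0 = rho_s

# Omega^c = union of j0 i.i.d. Ber(q) masks; Omega is its complement
in_some_layer = np.zeros((n1, n2), dtype=bool)
for _ in range(j0):
    in_some_layer |= rng.random((n1, n2)) < q
Omega = ~in_some_layer

# Empirically, P((i, j) in Omega) = (1 - q)^j0 = rho_s
print(Omega.mean())                      # close to 0.1
```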

177 | The eigenvalues of random symmetric matrices
- Füredi, Komlós
- 1981
Citation Context: ...here ρ0 = m0/(n1n2) + ϵ0. The proof is given in Appendix B. A similar statement is given in Appendix A.1 of [3], but without a proof. The expression for the second term on the right-hand side given there is e^(−2n1n2ϵ0²/ρ0), which is different from the one we derive. Lemma V.2. Let E be an n1 × n2 random matrix with entries i.i.d. (independently and identically distributed) as Eij = 1 w.p. ρs/2, 0 w.p. 1 − ρs, −1 w.p. ρs/2. (15) If ρs < 0.03 and (n1 + n2)^(1/6)/log(n1 + n2) > 10.5/(ρs^(1/6)(1 − 5.6561√ρs)), then P(∥E∥ ≥ 0.5√n(1)) ≤ n(1)^(−10). The proof is provided in Appendix C and uses the result of [24]. In [3], the authors claim that, using [25], ∥E∥ > 0.25√n(1) w.p. less than n(1)^(−10). While the claim is correct, it is not possible to prove it using any of the results from [25]. Using ideas from [25], one can only show that the above holds when n(2) is upper bounded by a constant times log n(1) (see Appendix H), which is a strong extra assumption. B. Proof Architecture The proof of the theorem involves 4 main steps. (a) The first step is to show that when the locations of the support of S are Bernoulli distributed with parameter ρs and the signs of S are i.i.d. ±1 with probability 1/2...

172 | A framework for robust subspace learning
- Torre, Black
- 2003
Citation Context: ...with center-light, right-light, left-light, and normal-light, for each of these 12 subjects. Thus the training data matrix MG is 19520 × 48. We compute G by keeping its left singular vectors corresponding to 99% energy. This results in rG = 38. We use another two face images per subject for each of the twelve subjects, some with glasses and some without, as the test data, i.e., the measurement matrix M. Thus M is 19520 × 24. In the experiments, we compare modified-PCP with PCP [3] and ReProCS [20], [21], and also with some of the other algorithms compared in [21]: robust subspace learning (RSL) [34], a batch robust PCA algorithm that was compared against in [3], and GRASTA [35], a very recent online robust PCA algorithm. We also compare against Dense Error Correction (DEC) [2], [36], since it was the first to address this application using ℓ1 minimization. To implement DEC [2], [36], we normalize each column of MG to get the dictionary (D)n1×48, and we solve (xi, si) = argmin_{x,s} ∥x∥1 + ∥s∥1 subject to Mi = Dx + s using YALL-1. Here Mi is the ith column of M. The solution gives us si and ℓi = Dxi. For PCP and RSL, we use the test dataset only,...

165 | Fast computation of low-rank matrix approximations - Achlioptas, McSherry

116 | Modified-CS: Modifying compressive sensing for problems with partially known support
- Vaswani, Lu
- 2010
Citation Context: ...ng rank deficient. Hence, in this case, mod-PCP will require a different approach to selecting λ and will require more assumptions. Modified-PCP can be interpreted as adapting the idea of modified-CS [18] to the robust PCA problem. Modified-CS solves the problem of sparse recovery with partial support knowledge. Its solution idea is to try to find the vector that is sparsest on the complement set of t...

92 | Compressive principal component pursuit
- Wright, Ganesh, et al.
- 2013
Citation Context: ...96 × 144. For the first 600 background images, we form a low-rank matrix [MG L] by stacking each image as a column (the first 300 columns belong to MG and the rest belong to L). With the same steps for the lake sequence, we get ρr(PCP) = 4.311 × 10⁴ and ρr(mod-PCP) = 1.7866 × 10⁴. F. Comparison with Simulated Noisy Data In order to address an anonymous reviewer's comment, we have also added simulations with noisy data. We assume the measurement model M = L + S + Z (23), where L is low rank (with partial knowledge G, similar to the previous case), S is sparse, and Z is a noise term with ∥Z∥F ≤ σ. [Fig. 8: Lake sequence NMSE comparison, plotting ∥ŝt − st∥2/∥st∥2 against t for mod-PCP, ReProCS, PCP, GRASTA, RSL, and GOSUS.] Inspired by [38], we propose the following optimization problem: minimize over Lnew, S, X: ∥Lnew∥∗ + λ∥S∥1 subject to ∥Lnew + GX∗ + S − M∥F ≤ σ (24), with λ = 1/√max{n1, n2}. To compare the result with stable PCP [38], we generated square matrices as stated in [38, Section V], i.e., n1 = n2 = 200, r = 10, rnew = 2, rextra = 0, ρs = 0.2, L = XY∗, where X and Y are independent n1 × r i.i.d. N(0, 1/n1) matrices, and each entry of S is independently...
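The noisy simulation setup quoted above can be reproduced as follows. The snippet truncates before stating the distribution of S's entries, so the ±1 values on a Ber(ρs) support below are an assumption; the sizes are as quoted (n1 = n2 = 200, r = 10, ρs = 0.2):

```python
import numpy as np

rng = np.random.default_rng(6)
n1 = n2 = 200; r = 10; rho_s = 0.2; sigma = 0.1   # sigma is an assumed noise level

# L = X Y* with X, Y independent n1 x r i.i.d. N(0, 1/n1), as in the cited setup
X = rng.normal(0.0, np.sqrt(1.0 / n1), (n1, r))
Y = rng.normal(0.0, np.sqrt(1.0 / n1), (n2, r))
L = X @ Y.T

# The snippet cuts off before S's entry distribution; +-1 values on a
# Ber(rho_s) support is one common choice and is assumed here.
support = rng.random((n1, n2)) < rho_s
S = support * rng.choice([-1.0, 1.0], (n1, n2))

# Noise term scaled so that ||Z||_F <= sigma, matching the constraint in (24)
Z = rng.standard_normal((n1, n2))
Z *= sigma / np.linalg.norm(Z, 'fro')

M = L + S + Z            # measurement model (23): M = L + S + Z
```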

90 | Characterization of the subdifferential of some matrix norms, Linear Algebra and its Applications
- Watson
- 1992
Citation Context: ...tions on the dual certificate under which Lnew, S is the unique minimizer of (7). This is done in Sec. V-E. Next, we use the golfing scheme [26], [3] to construct a dual certificate that satisfies the required conditions (Sec. V-F). C. Basic Facts We state some basic facts which will be used in the following proof. Definition V.3 (Sub-gradient [27]). Consider a convex function f : O → R on a convex set of matrices O. A matrix Y is called its sub-gradient at a point X0 ∈ O if f(X) − f(X0) ≥ ⟨Y, X − X0⟩ for all X ∈ O. The set of all sub-gradients of f at X0 is denoted by ∂f(X0). It is known [28], [29] that ∂∥Lnew∥∗ = {UnewV∗new + W : PTnewW = 0, ∥W∥ ≤ 1} and ∂∥S∥1 = {F : PΩF = sgn(S), ∥F∥∞ ≤ 1}. Definition V.4 (Dual norm [8]). The matrix norm ∥·∥♡ is said to be dual to the matrix norm ∥·∥♠ if, for all Y1 ∈ Rn1×n2, ∥Y1∥♡ = sup{∥Y2∥♠ ≤ 1} ⟨Y1,Y2⟩. Proposition V.5 (Proposition 2.1 of [30]). The following pairs of matrix norms are dual to each other: • ∥·∥1 and ∥·∥∞; • ∥·∥∗ and ∥·∥; • ∥·∥F and ∥·∥F. For all these pairs, the following hold. 1) |⟨Y,Z⟩| ≤ ∥Y∥♠∥Z∥♡. 2) Fixing any Y ∈ Rn1×n2, there exists Z ∈ Rn1×n2 (depending on Y) such that ⟨Y,Z⟩ = ∥Y∥♠∥Z∥♡. 3) In particular, we can ...

86 | Robust PCA via outlier pursuit
- Xu, Caramanis, et al.
- 2010
Citation Context: ...omplexity) robust PCA algorithm. Since then, the batch robust PCA problem, or what is now also often called the sparse+low-rank recovery problem, has been studied extensively, e.g., see [4], [5], [6], [7], [8], [9], [10], [11], [12], [13]. Contribution: In this work we study the following problem. Suppose that we have a partial estimate of the column subspace of the low rank matrix L. How can we use t...

86 | Alternating direction algorithms for l1 problems in compressive sensing
- Yang, Zhang
- 2010
Citation Context: ...= S̃k, L̂ = M − S̃k. λ∥S̃k∥1 + ⟨Y, PΓ?M − PΓ?S̃k − L̃⟩ + (τ/2)∥PΓ?M − PΓ?S̃k − L̃∥²F. The soft-thresholding operator is defined as Sϵ[x] = x − ϵ if x > ϵ; x + ϵ if x < −ϵ; 0 otherwise. (13) We use YALL-1 [31] to solve Line 5. Parameters are set as suggested in [29], i.e., τ0 = 1.25/∥PΓ?M∥, v = 1.5, τ̄ = 10⁷τ0, and the iteration is stopped when ∥PΓ?M − PΓ?S̃k+1 − L̃k+1∥F /∥PΓ?M∥F < 10⁻⁷. The above algorithm is...
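The piecewise definition of the soft-thresholding operator in (13) collapses to the usual closed form sgn(x) · max(|x| − ϵ, 0); a minimal vectorized version:

```python
import numpy as np

def soft_threshold(x, eps):
    """S_eps[x] = x - eps if x > eps; x + eps if x < -eps; 0 otherwise."""
    return np.sign(x) * np.maximum(np.abs(x) - eps, 0.0)

x = np.array([-2.0, -0.3, 0.0, 0.3, 2.0])
print(soft_threshold(x, 0.5))       # componentwise: -1.5, 0, 0, 0, 1.5
```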

80 | Quantum State Tomography via compressed sensing
- Gross, Lou, et al.
- 2009
Citation Context: ...er of (8) (Lemma 3.1). These conditions are further relaxed to get a set of conditions on the dual certificate that are easy to satisfy, as is also done in [2] (Lemma 3.2). Finally, the golfing scheme [28], [2] is used to construct this dual certificate and to show that it indeed satisfies the required conditions. The proof needs the following linear space of matrices. Π := {[G Unew]X∗ + YV∗new, X ∈ Rn...

78 | Online identification and tracking of subspaces from highly incomplete information.
- Balzano, Nowak, et al.
- 2010
Citation Context: ...ubspace) have been shown to approximate the batch solution and do so only asymptotically. Other somewhat related work includes online algorithms for low-rank matrix completion and dictionary learning [25], [26], [27]. III. PROOF OUTLINE The overall proof approach is similar to that in [2]. The first step involves starting with the KKT conditions and relaxing them to find a set of conditions under whic...

74 | Recovering low-rank and sparse components of matrices from incomplete and noisy observations
- Tao, Yuan
- 2011
Citation Context: ...algorithm. Since then, the batch robust PCA problem, or what is now also often called the sparse+low-rank recovery problem, has been studied extensively, e.g., see [4], [5], [6], [7], [8], [9], [10], [11], [12], [13]. Contribution: In this work we study the following problem. Suppose that we have a partial estimate of the column subspace of the low rank matrix L. How can we use this information to imp...

68 | Fixed-point continuation for ℓ1 minimization: methodology and convergence
- Hale, Yin, et al.
- 2008
Citation Context: ...an initial background-only sequence. For this application, modified-PCP can be used to design a piecewise batch solution that will be faster and will require weaker assumptions for exact recovery than PCP. This is made precise in Corollary IV.1. We also show extensive simulation comparisons and some real data comparisons of modified-PCP with PCP and with other existing robust PCA solutions from the literature. The implementation requires a fast algorithm for solving the modified-PCP program. We develop this by modifying the Inexact Augmented Lagrange Multiplier Method of [15] and using the idea of [16], [17] for the sparse recovery step. Notation. For a matrix X, we denote by X∗ the transpose of X; denote by ∥X∥∞ the ℓ∞ norm of X reshaped as a long vector, i.e., maxi,j |Xij|; denote by ∥X∥ the operator norm or 2-norm; denote by ∥X∥F the Frobenius norm. Let I denote the identity operator, i.e., I(Y) = Y for any matrix Y. Let ∥A∥ denote the operator norm of the operator A, i.e., ∥A∥ = sup{X : ∥X∥F = 1} ∥A(X)∥F; let ⟨X,Y⟩ denote the Euclidean inner product between two matrices, i.e., trace(X∗Y); let sgn(X) denote the entrywise sign of X. We let PΘ denote the orthogonal projection onto a linear subspac...

61 | Noisy matrix decomposition via convex relaxation: Optimal rates in high dimensions
- Agarwal, Negahban, et al.
- 2012
Citation Context: ...Since then, the batch robust PCA problem, or what is now also often called the sparse+low-rank recovery problem, has been studied extensively, e.g., see [4], [5], [6], [7], [8], [9], [10], [11], [12], [13]. Contribution: In this work we study the following problem. Suppose that we have a partial estimate of the column subspace of the low rank matrix L. How can we use this information to improve the PCP...

60 | Spectral norm of random matrices - Vu - 2005

43 | Two proposals for robust PCA using semidefinite programming, Electron. J. Statist.
- McCoy, Tropp
- 2011
Citation Context: ...ial complexity) robust PCA algorithm. Since then, the batch robust PCA problem, or what is now also often called the sparse+low-rank recovery problem, has been studied extensively, e.g., see [4], [5], [6], [7], [8], [9], [10], [11], [12], [13]. Contribution: In this work we study the following problem. Suppose that we have a partial estimate of the column subspace of the low rank matrix L. How can we ...

43 | Robust matrix decomposition with sparse corruptions
- Hsu, Kakade, et al.
- 2011
Citation Context: ...called it principal components' pursuit (PCP). Here ∥L∥∗ denotes the nuclear norm of L and ∥S∥1 denotes the ℓ1 norm of S reshaped as a long vector. This was among the first recovery guarantees for a practical (polynomial complexity) robust PCA algorithm. Since then, the batch robust PCA problem, or what is now also often called the sparse+low-rank recovery problem, has been studied extensively, both theoretically and empirically, e.g., see [2], [5], [6], [7], [8], [9], [10], [11], [12], [13]. (A shorter version of this paper appears in the proceedings of ISIT 2014 [1]. This work was supported in part by NSF grant CCF-1117125.) Contribution: In this work we study the following problem. Suppose that we have a partial estimate of the column space of the low-rank matrix L. How can we use this information to improve the PCP solution, i.e., allow recovery under weaker assumptions? We propose here a simple but useful modification of the PCP idea, called modified-PCP, that allows us to use this knowledge. We derive its correctness result (Theorem III.1), which provides explicit bounds on the various constants and on the matrix size that are needed to ensure exact recovery with high probability. O...

40 | Dense error correction via l1-minimization
- Wright, Ma
- 2010
Citation Context: ...ence of outliers is called robust PCA. An outlier is a loosely defined term that refers to any corruption that is not small compared to the true data vector and that occurs occasionally. As suggested in [1], an outlier can be nicely modeled as a sparse vector. The robust PCA problem occurs in various applications ranging from video analysis to recommender system design in the presence of outliers, e.g....

33 | Incremental gradient on the grassmannian for online foreground and background separation in subsampled video
- He, Balzano, et al.
- 2012
Citation Context: ...subject to the data constraint. Modified-PCP applies a similar idea to the vector of singular values of the low-rank matrix. Other recent work on algorithms for recursive / online robust PCA includes [19], [20], [21], [22], [23], [24]. In [22], [23], two online algorithms for robust PCA (that do not model the outlier as a sparse vector but as a vector that is "far" from the data subspace) have been sh...

27 | Dense error correction for low-rank matrices via principal component pursuit
- Ganesh, Wright, et al.
Citation Context: ...l (polynomial complexity) robust PCA algorithm. Since then, the batch robust PCA problem, or what is now also often called the sparse+low-rank recovery problem, has been studied extensively, e.g., see [4], [5], [6], [7], [8], [9], [10], [11], [12], [13]. Contribution: In this work we study the following problem. Suppose that we have a partial estimate of the column subspace of the low rank matrix L. H...

22 | Dynamic anomalography: Tracking network anomalies via sparsity and low rank
- Mardani, Mateos, et al.
- 2013
Citation Context: ...Modified-PCP applies a similar idea to the vector of singular values of the low-rank matrix. Other recent work on algorithms for recursive / online robust PCA includes [19], [20], [21], [22], [23], [24]. In [22], [23], two online algorithms for robust PCA (that do not model the outlier as a sparse vector but as a vector that is "far" from the data subspace) have been shown to approximate the batch s...

18 | Recursive robust PCA or recursive sparse recovery in large but structured noise
- Qiu, Vaswani, et al.
- 2013
Citation Context: ...Starting with an initial knowledge of the subspace, the goal is to estimate the subspace spanned by ℓ1, ℓ2, . . . , ℓt and to recover the st's. Assume the following subspace change model introduced in [15]: ℓt = P(t)at, where P(t) = Pj for all tj ≤ t < tj+1, j = 0, 1, . . . , J. At the change times, Pj changes as Pj = [(Pj−1Rj \ Pj,old) Pj,new], where Pj,new is an n × cj,new basis matrix that satisfies P∗j,...
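The subspace change model above drops some old directions and appends cj,new new ones orthogonal to the retained basis. A small simulation sketch, with Rj taken as the identity for simplicity (an assumption, since the snippet truncates before defining Rj and the full constraints on Pj,new):

```python
import numpy as np

rng = np.random.default_rng(7)
n, r, c_new, c_old = 50, 5, 1, 1

def orthonormal(n, r, rng):
    """Random n x r basis matrix (orthonormal columns)."""
    Q, _ = np.linalg.qr(rng.standard_normal((n, r)))
    return Q

P_prev = orthonormal(n, r, rng)           # P_{j-1}: basis before the change time

# Sketch of P_j = [(P_{j-1} R_j \ P_{j,old})  P_{j,new}] with R_j = I:
# drop c_old old directions, append c_new new directions orthogonal
# to the retained basis (so that P_kept* P_new = 0).
P_kept = P_prev[:, :-c_old]
fresh = rng.standard_normal((n, c_new))
fresh -= P_kept @ (P_kept.T @ fresh)      # project out the retained subspace
fresh /= np.linalg.norm(fresh, axis=0)
P_j = np.hstack([P_kept, fresh])

# l_t = P_(t) a_t for t inside the j-th interval
a_t = rng.standard_normal(P_j.shape[1])
l_t = P_j @ a_t
assert np.allclose(P_j.T @ P_j, np.eye(P_j.shape[1]), atol=1e-10)
```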

18 | Online robust pca via stochastic optimization
- Feng, Xu, et al.
- 2013
Citation Context: ...a constraint. Modified-PCP applies a similar idea to the vector of singular values of the low-rank matrix. Other recent work on algorithms for recursive / online robust PCA includes [19], [20], [21], [22], [23], [24]. In [22], [23], two online algorithms for robust PCA (that do not model the outlier as a sparse vector but as a vector that is "far" from the data subspace) have been shown to approximate...

18 | Real-time robust principal components’ pursuit
- Qiu, Vaswani
- 2010
Citation Context: ...1 using Mj. If Sfull satisfies the assumptions of Theorem III.1 and if (8), (9), and (10) hold with n1 = n, n2 = tj+1 − tj, GPCP = [ ], Unew,PCP = Pj, and Vnew,PCP = Vj being the right singular vectors of Lj for all j = 1, 2, . . . , J, then we can recover Lfull and Sfull exactly with probability at least (1 − 23n⁻¹⁰)^J. When we compare this with modified-PCP, the second and third conditions are significantly weaker than those for PCP when cj,new ≪ rj. The first condition is exactly the same when cj,old = 0 and is only slightly stronger as long as cj,old ≪ rj. Discussion w.r.t. ReProCS. In [20], [21], [14], Qiu et al. studied the online / recursive robust PCA problem and proposed a novel recursive algorithm called ReProCS. With the subspace change model described above, they also needed the following "slow subspace change" assumption: ∥P∗j,new ℓt∥ is small for some time after tj and increases gradually. Modified-PCP does not need this. Moreover, even with perfect initial subspace knowledge, ReProCS cannot achieve exact recovery of st or ℓt while, as shown above, modified-PCP can. On the other hand, ReProCS is a recursive algorithm while modified-PCP is not; and for highly correlated sup...

15 | Principal component pursuit with reduced linear measurements
- Ganesh, Min, et al.
Citation Context: ...xity) robust PCA algorithm. Since then, the batch robust PCA problem, or what is now also often called the sparse+low-rank recovery problem, has been studied extensively, e.g., see [4], [5], [6], [7], [8], [9], [10], [11], [12], [13]. Contribution: In this work we study the following problem. Suppose that we have a partial estimate of the column subspace of the low rank matrix L. How can we use this i...

15 | Robust matrix decomposition with outliers, arXiv:1011.1518
- Hsu, Kakade, et al.
Citation Context: ...robust PCA algorithm. Since then, the batch robust PCA problem, or what is now also often called the sparse+low-rank recovery problem, has been studied extensively, e.g., see [4], [5], [6], [7], [8], [9], [10], [11], [12], [13]. Contribution: In this work we study the following problem. Suppose that we have a partial estimate of the column subspace of the low rank matrix L. How can we use this inform...

9 | PETRELS: Parallel subspace estimation and tracking by recursive least squares from partial observations
- Chi, Eldar, et al.
- 2013
Citation Context: ...e) have been shown to approximate the batch solution and do so only asymptotically. Other somewhat related work includes online algorithms for low-rank matrix completion and dictionary learning [25], [26], [27]. III. PROOF OUTLINE The overall proof approach is similar to that in [2]. The first step involves starting with the KKT conditions and relaxing them to find a set of conditions under which Lnew...

9 | An online algorithm for separating sparse and low-dimensional signal sequences from their sum
- Guo, Qiu, et al.
- 2014
Citation Context ...g Mj . If Sfull satisfies the assumptions of Theorem III.1 and if (8), (9), and (10) hold with n1 = n, n2 = tj+1 − tj , GPCP = [ ], Unew,PCP = Pj and Vnew,PCP = Vj being the right singular vectors of Lj for all j = 1, 2, . . . , J , then, we can recover Lfull and Sfull exactly with probability at least (1− 23n−10)J . When we compare this with modified-PCP, the second and third condition are significantly weaker than those for PCP when cj,new ≪ rj . The first condition is exactly the same when cj,old = 0 and is only slightly stronger as long as cj,old ≪ rj . Discussion w.r.t. ReProCS. In [20], [21], [14], Qiu et al studied the online / recursive robust PCA problem and proposed a novel recursive algorithm called ReProCS. With the subspace change model described above, they also needed the following “slow subspace change” assumption: ∥P ∗j,newℓt∥ is small for sometime after tj and increases gradually. ModifiedPCP does not need this. Moreover, even with perfect initial subspace knowledge, ReProCS cannot achieve exact recovery of st or ℓt while, as shown above, modified-PCP can. On the other hand, ReProCS is a recursive algorithm while modifiedPCP is not; and for highly correlated support c... |

8 | Compressive principal component pursuit, arXiv:1202.4596
- Wright, Ganesh, et al.
Citation Context: ...st PCA algorithm. Since then, the batch robust PCA problem, or what is now also often called the sparse+low-rank recovery problem, has been studied extensively, e.g., see [4], [5], [6], [7], [8], [9], [10], [11], [12], [13]. Contribution: In this work we study the following problem. Suppose that we have a partial estimate of the column subspace of the low rank matrix L. How can we use this information ...

7 | A novel m-estimator for robust PCA, arXiv:1112.4863v3
- Zhang, Lerman
- 2013
Citation Context: ...lynomial complexity) robust PCA algorithm. Since then, the batch robust PCA problem, or what is now also often called the sparse+low-rank recovery problem, has been studied extensively, e.g., see [4], [5], [6], [7], [8], [9], [10], [11], [12], [13]. Contribution: In this work we study the following problem. Suppose that we have a partial estimate of the column subspace of the low rank matrix L. How ca...

7 | Robust PCA via principal component pursuit: A review for a comparative evaluation in video surveillance
- Bouwmans, Zahzah
- 2014
Citation Context ...nts’ pursuit (PCP). Here ∥L∥∗ denotes the nuclear norm of L and ∥S∥1 denotes the ℓ1 norm of S reshaped as a long vector. This was among the first recovery guarantees for a practical (polynomial complexity) robust PCA algorithm. (A shorter version of this paper appears in the proceedings of ISIT 2014 [1]. This work was supported in part by NSF grant CCF-1117125.) Since then, the batch robust PCA problem, or what is now also often called the sparse+low-rank recovery problem, has been studied extensively, both theoretically and empirically, e.g. see [2], [5], [6], [7], [8], [9], [10], [11], [12], [13]. Contribution: In this work we study the following problem. Suppose that we have a partial estimate of the column space of the low rank matrix L. How can we use this information to improve the PCP solution, i.e. allow recovery under weaker assumptions? We propose here a simple but useful modification of the PCP idea, called modified-PCP, that allows us to use this knowledge. We derive its correctness result (Theorem III.1) that provides explicit bounds on the various constants and on the matrix size that are needed to ensure exact recovery with high probability. Our result is used to argue th... |

6 | Online PCA for contaminated data
- Feng, Xu, et al.
- 2013
Citation Context ...traint. Modified-PCP applies a similar idea to the vector of singular values of the low rank matrix. Other recent work on algorithms for recursive / online robust PCA includes [19], [20], [21], [22], =-=[23]-=-, [24]. In [22], [23], two online algorithms for robust PCA (that do not model the outlier as a sparse vector but as a vector that is “far” from the data subspace) have been shown to approximate the b... |

4 |
Low-rank matrix recovery from errors and erasures
- Chen, Jalali, et al.
Citation Context ...ithm. Since then, the batch robust PCA problem, or what is now also often called the sparse+low-rank recovery problem, has been studied extensively, e.g. see [4], [5], [6], [7], [8], [9], [10], [11], =-=[12]-=-, [13]. Contribution: In this work we study the following problem. Suppose that we have a partial estimate of the column subspace of the low rank matrix L. How can we use this information to improve t... |

4 | Robust PCA and subspace tracking from incomplete observations using ℓ0-surrogates
- Hage, Kleinsteuber
Citation Context ...t to the data constraint. Modified-PCP applies a similar idea to the vector of singular values of the low rank matrix. Other recent work on algorithms for recursive / online robust PCA includes [19], =-=[20]-=-, [21], [22], [23], [24]. In [22], [23], two online algorithms for robust PCA (that do not model the outlier as a sparse vector but as a vector that is “far” from the data subspace) have been shown to... |

3 |
pROST: A smoothed ℓp-norm robust online subspace tracking method for realtime background subtraction in video,” arXiv preprint arXiv:1302.2073,
- Seidel, Hage, et al.
- 2013
Citation Context ...ipal components’ pursuit (PCP). Here ∥L∥∗ denotes the nuclear norm of L and ∥S∥1 denotes the ℓ1 norm of S reshaped as a long vector. This was among the first recovery guarantees for a practical (polynomial complexity) robust PCA algorithm. Since then, the batch robust PCA problem, or what is now also often called the sparse+low-rank recovery problem, has been studied extensively, both theoretically and empirically, e.g. see [2], [5], [6], [7], [8], [9], [10], [11], [12], [13]. Contribution: In this work we study the following problem. Suppose that we have a partial estimate of the column space of the low rank matrix L. How can we use this information to improve the PCP solution, i.e. allow recovery under weaker assumptions? We propose here a simple but useful modification of the PCP idea, called modified-PCP, that allows us to use this knowledge. We derive its correctness result (Theorem III.1) that provides explicit bounds on the various constants and on the matrix size that are needed to ensure exact recovery with high probability. Our result is used... |

3 | Gosus: Grassmannian online subspace updates with structured-sparsity,”
- Xu, Ithapu, et al.
- 2013
Citation Context ...omponents’ pursuit (PCP). Here ∥L∥∗ denotes the nuclear norm of L and ∥S∥1 denotes the ℓ1 norm of S reshaped as a long vector. This was among the first recovery guarantees for a practical (polynomial complexity) robust PCA algorithm. Since then, the batch robust PCA problem, or what is now also often called the sparse+low-rank recovery problem, has been studied extensively, both theoretically and empirically, e.g. see [2], [5], [6], [7], [8], [9], [10], [11], [12], [13]. Contribution: In this work we study the following problem. Suppose that we have a partial estimate of the column space of the low rank matrix L. How can we use this information to improve the PCP solution, i.e. allow recovery under weaker assumptions? We propose here a simple but useful modification of the PCP idea, called modified-PCP, that allows us to use this knowledge. We derive its correctness result (Theorem III.1) that provides explicit bounds on the various constants and on the matrix size that are needed to ensure exact recovery with high probability. Our result is used to ar... |

2 |
Real-time robust principal components
- Qiu, Vaswani
- 2010
Citation Context ..., . . . J, the bound of Theorem 2.1 holds with n(1) = n1, n(2) = α, r = rj and µ = µj, then we can recover L and S exactly and in a piecewise batch fashion with probability at least (1 − c n^−10)^J. In =-=[16]-=-, [17], [15], Qiu et al. studied the online / recursive robust PCA problem and proposed a novel recursive algorithm called ReProCS. With the subspace change model described above, they also needed the ... |

2 |
Guaranteed minimum-rank solutions of linear matrix equations via nuclear norm minimization,”
- Recht, Fazel, et al.
- 2010
Citation Context ...used in the following proof. Definition V.3 (Sub-gradient [27]). Consider a convex function f : O → R on a convex set of matrices O. A matrix Y is called its sub-gradient at a point X0 ∈ O if f(X) − f(X0) ≥ ⟨Y, X − X0⟩ for all X ∈ O. The set of all sub-gradients of f at X0 is denoted by ∂f(X0). It is known [28], [29] that ∂∥Lnew∥∗ = {UnewVnew∗ + W : PTnew(W) = 0, ∥W∥ ≤ 1} and ∂∥S∥1 = {F : PΩ(F) = sgn(S), ∥F∥∞ ≤ 1}. Definition V.4 (Dual norm [8]). The matrix norm ∥ · ∥♡ is said to be dual to the matrix norm ∥ · ∥♠ if, for all Y1 ∈ Rn1×n2, ∥Y1∥♡ = sup{⟨Y1,Y2⟩ : ∥Y2∥♠ ≤ 1}. Proposition V.5 (Proposition 2.1 of [30]). The following pairs of matrix norms are dual to each other: • ∥ · ∥1 and ∥ · ∥∞; • ∥ · ∥∗ and ∥ · ∥; • ∥ · ∥F and ∥ · ∥F. For all these pairs, the following hold. 1) |⟨Y,Z⟩| ≤ ∥Y∥♠∥Z∥♡. 2) Fixing any Y ∈ Rn1×n2, there exists Z ∈ Rn1×n2 (that depends on Y) such that ⟨Y,Z⟩ = ∥Y∥♠∥Z∥♡. 3) In particular, we can get ⟨Y,Z⟩ = ∥Y∥1∥Z∥∞ by setting Z = sgn(Y), we can get ⟨Y,Z⟩ = ∥Y∥∗∥Z∥ by setting Z = UY VY∗ where UY ΣY VY∗ is the SVD of Y, and we can get ⟨Y,Z⟩ = ∥Y∥F∥Z∥F by letting Z = Y. For any matrix Y, we have ∥Y∥F² = trace(Y∗Y) = Σi,j |Yij|² ≤ (Σi,j |Yij|)² = ∥Y∥1² and ∥Y∥F² = trace(Y∗Y)... |
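The dual-norm pairs and the attaining choices of Z in the proposition quoted above are easy to confirm numerically; a quick sanity check with numpy (variable names ours):

```python
import numpy as np

rng = np.random.default_rng(0)
Y = rng.standard_normal((5, 7))

# l1 / l-infinity pair: Z = sgn(Y) attains <Y,Z> = ||Y||_1 * ||Z||_inf.
Z = np.sign(Y)
assert np.isclose((Y * Z).sum(), np.abs(Y).sum() * np.abs(Z).max())

# Nuclear / spectral pair: Z = U V* from the SVD of Y attains
# <Y,Z> = ||Y||_* * ||Z||, and the spectral norm of Z is 1.
U, s, Vt = np.linalg.svd(Y, full_matrices=False)
Z = U @ Vt
assert np.isclose(np.linalg.norm(Z, 2), 1.0)
assert np.isclose((Y * Z).sum(), s.sum() * np.linalg.norm(Z, 2))

# Frobenius is self-dual: Z = Y attains <Y,Z> = ||Y||_F^2.
assert np.isclose((Y * Y).sum(), np.linalg.norm(Y, 'fro') ** 2)
```

The inequality at the end of the context, ∥Y∥F² ≤ ∥Y∥1², also follows directly from these checks since the Frobenius norm is dominated by the entrywise ℓ1 norm.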

2 |
A correctness result for online robust PCA,”
- Lois, Vaswani
- 2014
Citation Context ...f them, but the result of DEC has extra shadows in the face estimate. The other algorithms succeed for none of the 24 frames. Both ReProCS and GRASTA assume that the initial subspace estimate is accurate and that “slow subspace change” holds, neither of which happens here, and this is the reason that neither of them works. RSL does not converge for this data set because the available number of frames is too small. The time taken by each algorithm is shown in Table I. D. Online robust PCA: simulated data comparisons. For simulation comparisons for online robust PCA, we generated data as explained in [37]. The data was generated using the model given in Section IV, with n = 256, J = 3, r0 = 40, t0 = 200 and cj,new = 4, cj,old = 4, for each j = 1, 2, 3. The coefficients at,∗ = P∗j−1 ℓt were i.i.d. uniformly distributed in the interval [−γ, γ]; the coefficients along the new directions, at,new := P∗j,new ℓt, were generated i.i.d. uniformly distributed in the interval [−γnew, γnew] (with a γnew ≤ γ) for the first 1700 columns after the subspace change and i.i.d. uniformly distributed in the interval [−γ, γ] after that. We vary the value of γnew; small values mean that “slow subspace change” required by ... |
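The simulated-data model described in this context (an r0-dimensional subspace, with cj,new new directions entering at a change time with small coefficient magnitudes) can be reproduced along the following lines. This is a simplified sketch with a single subspace change; γ, γnew, the outlier magnitudes, and the support size are illustrative choices of ours, not the paper's exact settings:

```python
import numpy as np

rng = np.random.default_rng(1)
n, r0, c_new = 256, 40, 4          # ambient dim, initial rank, new directions
gamma, gamma_new = 5.0, 0.5        # coefficient ranges (illustrative)
t_change, T = 200, 1200            # single change time, total columns

# Orthonormal basis: first r0 columns are P_0, the rest enter at t_change.
Q, _ = np.linalg.qr(rng.standard_normal((n, r0 + c_new)))
P_old, P_new = Q[:, :r0], Q[:, r0:]

L_cols, S_cols = [], []
for t in range(T):
    ell = P_old @ rng.uniform(-gamma, gamma, r0)
    if t >= t_change:
        # "Slow subspace change": coefficients along the new
        # directions stay small right after the change.
        ell = ell + P_new @ rng.uniform(-gamma_new, gamma_new, c_new)
    s = np.zeros(n)                # sparse outlier column
    supp = rng.choice(n, size=10, replace=False)
    s[supp] = rng.uniform(10, 20, 10) * rng.choice([-1, 1], 10)
    L_cols.append(ell); S_cols.append(s)

L = np.column_stack(L_cols)        # low-rank component
S = np.column_stack(S_cols)        # sparse component
M = L + S                          # observed matrix
```

Varying gamma_new then plays the same role as in the experiments described above: the smaller it is, the better the "slow subspace change" assumption of ReProCS is satisfied.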

1 |
Robust PCA with partial subspace knowledge,” http://home.engineering.iastate.edu/~jzhan/modPCPfull.pdf
- Zhan, Vaswani
Citation Context ...y at least 1 − c n(1)^−10 for some numerical constant c. Proof: The proof is obtained by adapting the proof techniques of [2] to our problem. We provide a brief outline in Sec III and a complete proof in =-=[14]-=-. C. Discussion. For simplifying our discussion, first just assume that range(G) = range(U0) so that rG = r0 and rextra = 0. Also suppose that n(1) = n1 and n(2) = n2, i.e. the matrix has fewer columns... |

1 |
Recursive sparse recovery
- Qiu, Vaswani
- 2011
Citation Context .... J , the bound of Theorem 2.1 holds with n(1) = n1, n(2) = α, r = rj and µ = µj then, we can recover L and S exactly and in a piecewise batch fashion with probability at least (1− cn−10)J . In [16], =-=[17]-=-, [15], Qiu et al studied the online / recursive robust PCA problem and proposed a novel recursive algorithm called ReProCS. With the subspace change model described above, they also needed the follow... |

1 |
Online PCA for contaminated data,” in Adv. Neural Info.
- Feng, Xu, et al.
- 2013
Citation Context ...s gradually. Modified-PCP does not need this. Moreover, even with perfect initial subspace knowledge, ReProCS cannot achieve exact recovery of st or ℓt while, as shown above, modified-PCP can. On the other hand, ReProCS is a recursive algorithm while modified-PCP is not; and for highly correlated support changes of the st’s, ReProCS outperforms modified-PCP (see Sec VI). The reason is that correlated support change results in S also being rank deficient, thus making it difficult to separate it from Lnew by modified-PCP. Discussion w.r.t. the work of Feng et al. Recent work of Feng et al. [22], [23] provides two asymptotic results for online robust PCA. The first work [22] does not model the outlier as a sparse vector but just as a vector that is “far” from the low-dimensional data subspace. In [23], the authors reformulate the PCP program and use this to develop a recursive algorithm that comes “close” to the PCP solution asymptotically. V. PROOF OF THEOREM III.1: MAIN LEMMAS. Our proof adapts the proof approach of [3] to our new problem and the modified-PCP solution. The main new lemma is Lemma V.7, in which we obtain different and weaker conditions on the dual certificate to ensure exac... |

1 |
The mathematics of eigenvalue optimization,”
- Lewis
- 2003
Citation Context ... conditions on the dual certificate under which Lnew,S is the unique minimizer of (7). This is done in Sec V-E. Next, we use the golfing scheme [26], [3] to construct a dual certificate that satisfies the required conditions (Sec. V-F). C. Basic Facts. We state some basic facts which will be used in the following proof. Definition V.3 (Sub-gradient [27]). Consider a convex function f : O → R on a convex set of matrices O. A matrix Y is called its sub-gradient at a point X0 ∈ O if f(X) − f(X0) ≥ ⟨Y, X − X0⟩ for all X ∈ O. The set of all sub-gradients of f at X0 is denoted by ∂f(X0). It is known [28], [29] that ∂∥Lnew∥∗ = {UnewVnew∗ + W : PTnew(W) = 0, ∥W∥ ≤ 1} and ∂∥S∥1 = {F : PΩ(F) = sgn(S), ∥F∥∞ ≤ 1}. Definition V.4 (Dual norm [8]). The matrix norm ∥ · ∥♡ is said to be dual to the matrix norm ∥ · ∥♠ if, for all Y1 ∈ Rn1×n2, ∥Y1∥♡ = sup{⟨Y1,Y2⟩ : ∥Y2∥♠ ≤ 1}. Proposition V.5 (Proposition 2.1 of [30]). The following pairs of matrix norms are dual to each other: • ∥ · ∥1 and ∥ · ∥∞; • ∥ · ∥∗ and ∥ · ∥; • ∥ · ∥F and ∥ · ∥F. For all these pairs, the following hold. 1) |⟨Y,Z⟩| ≤ ∥Y∥♠∥Z∥♡. 2) Fixing any Y ∈ Rn1×n2, there exists Z ∈ Rn1×n2 (that depends on Y) such that ⟨Y,Z⟩ = ∥Y∥♠∥Z∥♡. 3) In particular, w... |