Results 1–10 of 37
Estimation of (near) low-rank matrices with noise and high-dimensional scaling
Abstract

Cited by 103 (19 self)
We study an instance of high-dimensional statistical inference in which the goal is to use N noisy observations to estimate a matrix Θ∗ ∈ R^{k×p} that is assumed to be either exactly low rank, or “near” low-rank, meaning that it can be well-approximated by a matrix with low rank. We consider an M-estimator based on regularization by the trace or nuclear norm over matrices, and analyze its performance under high-dimensional scaling. We provide non-asymptotic bounds on the Frobenius norm error that hold for a general class of noisy observation models, and apply to both exactly low-rank and approximately low-rank matrices. We then illustrate their consequences for a number of specific learning models, including low-rank multivariate or multi-task regression, system identification in vector autoregressive processes, and recovery of low-rank matrices from random projections. Simulations show excellent agreement with the high-dimensional scaling of the error predicted by our theory.
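Nuclear-norm regularization of this kind is commonly handled through its proximal operator, which soft-thresholds the singular values of its argument. The NumPy sketch below illustrates that operator on a toy noisy rank-1 matrix; it is an illustration of the general technique, not the paper's exact M-estimator.

```python
import numpy as np

def svd_soft_threshold(Y, lam):
    """Proximal operator of the nuclear norm:
    shrink every singular value of Y toward zero by lam."""
    U, s, Vt = np.linalg.svd(Y, full_matrices=False)
    s_shrunk = np.maximum(s - lam, 0.0)   # soft-threshold the spectrum
    return (U * s_shrunk) @ Vt            # rebuild with shrunken singular values

# A noisy observation of a rank-1 matrix: shrinkage suppresses the small
# (noise) singular values and returns a near-low-rank estimate.
rng = np.random.default_rng(0)
Theta_star = np.outer(rng.standard_normal(8), rng.standard_normal(6))  # rank 1
Y = Theta_star + 0.1 * rng.standard_normal((8, 6))
Theta_hat = svd_soft_threshold(Y, lam=0.8)
```

In a proximal-gradient scheme this operator would be applied once per iteration, with `Y` replaced by the current gradient step.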
High-dimensional analysis of semidefinite relaxations for sparse principal component analysis
, 2008
Abstract

Cited by 85 (4 self)
Principal component analysis (PCA) is a classical method for dimensionality reduction based on extracting the dominant eigenvectors of the sample covariance matrix. However, PCA is well known to behave poorly in the “large p, small n” setting, in which the problem dimension p is comparable to or larger than the sample size n. This paper studies PCA in this high-dimensional regime, but under the additional assumption that the maximal eigenvector is sparse, say with at most k nonzero components. We analyze two computationally tractable methods for recovering the support of this maximal eigenvector: (a) a simple diagonal cutoff method, which transitions from success to failure as a function of the order parameter θ_dia(n, p, k) = n/[k² log(p − k)]; and (b) a more sophisticated semidefinite programming (SDP) relaxation, which succeeds once the order parameter θ_sdp(n, p, k) = n/[k log(p − k)] is larger than a critical threshold. Our results thus highlight an interesting tradeoff between computational and statistical efficiency in high-dimensional inference.
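The simpler of the two methods, diagonal cutoff, can be sketched in a few lines: rank the coordinates by their sample variances (the diagonal of the sample covariance matrix) and keep the k largest. The spiked model below is an illustrative assumption, not the paper's experiment.

```python
import numpy as np

def diagonal_cutoff_support(X, k):
    """Estimate the support of a sparse leading eigenvector by keeping the
    k coordinates with the largest sample variances (the diagonal of the
    sample covariance matrix)."""
    variances = X.var(axis=0, ddof=1)
    return set(np.argsort(variances)[-k:])

# Spiked model: coordinates 0..2 carry the signal, the rest are pure noise,
# so the signal coordinates have variance 4 versus 1 for the others.
rng = np.random.default_rng(1)
n, p, k = 2000, 50, 3
v = np.zeros(p); v[:k] = 1.0 / np.sqrt(k)       # sparse leading eigenvector
X = rng.standard_normal((n, 1)) * (3.0 * v) + rng.standard_normal((n, p))
support = diagonal_cutoff_support(X, k)
print(sorted(support))                           # [0, 1, 2] for this seed
```

With n this large relative to k² log(p − k), the order-parameter condition in the abstract is comfortably satisfied, which is why the cutoff succeeds here.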
Sparse principal component analysis and iterative thresholding, The Annals of Statistics 41
Minimax rates of estimation for sparse PCA in high dimensions
, 2012
Abstract

Cited by 28 (3 self)
We study sparse principal components analysis in the high-dimensional setting, where p (the number of variables) can be much larger than n (the number of observations). We prove optimal, non-asymptotic lower and upper bounds on the minimax estimation error for the leading eigenvector when it belongs to an ℓ_q ball for q ∈ [0, 1]. Our bounds are sharp in p and n for all q ∈ [0, 1] over a wide class of distributions. The upper bound is obtained by analyzing the performance of ℓ_q-constrained PCA. In particular, our results provide convergence rates for ℓ_1-constrained PCA.
Truncated Power Method for Sparse Eigenvalue Problems
Abstract

Cited by 27 (1 self)
This paper considers the sparse eigenvalue problem, which is to extract dominant (largest) sparse eigenvectors with at most k nonzero components. We propose a simple yet effective solution called the truncated power method that can approximately solve the underlying nonconvex optimization problem. A strong sparse recovery result is proved for the truncated power method, and this theory is our key motivation for developing the new algorithm. The proposed method is tested on applications such as sparse principal component analysis and the densest k-subgraph problem. Extensive experiments on several synthetic and real-world data sets demonstrate the competitive empirical performance of our method.
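The update this abstract describes, a power step followed by hard truncation to the k largest-magnitude entries, is short enough to sketch directly. The spiked covariance matrix below is a toy assumption for the demo, not the authors' code or data.

```python
import numpy as np

def truncated_power_method(A, k, n_iter=100):
    """Power iteration with hard truncation: after each matrix-vector
    product, zero out all but the k largest-magnitude entries, then
    renormalize to unit length."""
    p = A.shape[0]
    x = np.full(p, 1.0 / np.sqrt(p))          # dense, deterministic start
    for _ in range(n_iter):
        y = A @ x
        keep = np.argsort(np.abs(y))[-k:]     # indices of the k largest |y_i|
        x = np.zeros(p)
        x[keep] = y[keep]
        x /= np.linalg.norm(x)
    return x

# Spiked covariance whose dominant eigenvector is supported on {0, 1, 2}.
p, k = 20, 3
v = np.zeros(p); v[:k] = 1.0 / np.sqrt(k)
A = 5.0 * np.outer(v, v) + np.eye(p)
x_hat = truncated_power_method(A, k)
```

Each iterate is k-sparse by construction, which is what distinguishes this from plain power iteration.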
CONSISTENCY OF RESTRICTED MAXIMUM LIKELIHOOD ESTIMATORS OF PRINCIPAL COMPONENTS
Abstract

Cited by 6 (4 self)
In this paper we consider two closely related problems: estimation of eigenvalues and eigenfunctions of the covariance kernel of functional data based on (possibly) irregular measurements, and the problem of estimating the eigenvalues and eigenvectors of the covariance matrix for high-dimensional Gaussian vectors. In [A geometric approach to maximum likelihood estimation of covariance kernel from sparse irregular longitudinal data (2007)], a restricted maximum likelihood (REML) approach has been developed to deal with the first problem. In this paper, we establish consistency and derive the rate of convergence of the REML estimator for the functional data case, under appropriate smoothness conditions. Moreover, we prove that when the number of measurements per sample curve is bounded, under squared-error loss, the rate of convergence of the REML estimators of eigenfunctions is near-optimal. In the case of Gaussian vectors, asymptotic consistency and an efficient score representation of the estimators are obtained under the assumption that the effective dimension grows at a rate slower than the sample size. These results are derived through an explicit utilization of the intrinsic geometry of the parameter space, which is non-Euclidean. Moreover, the results derived in this paper suggest an asymptotic equivalence between inference on functional data with dense measurements and that for high-dimensional Gaussian vectors.
Semiparametric Principal Component Analysis
Abstract

Cited by 6 (5 self)
We propose two new principal component analysis methods in this paper utilizing a semiparametric model. The corresponding methods are named Copula Component Analysis (COCA) and Copula PCA. The semiparametric model assumes that, after unspecified marginally monotone transformations, the distributions are multivariate Gaussian. COCA and Copula PCA accordingly estimate the leading eigenvectors of the correlation and covariance matrices of the latent Gaussian distribution. The robust nonparametric rank-based correlation coefficient estimator, Spearman’s rho, is exploited in estimation. We prove that, under suitable conditions, although the marginal distributions can be arbitrarily continuous, the COCA and Copula PCA estimators attain fast estimation rates and are feature-selection consistent in the setting where the dimension is nearly exponentially large relative to the sample size. Careful numerical experiments on synthetic and real data are conducted to back up the theoretical results. We also discuss the relationship with the transelliptical component analysis proposed by Han and Liu (2012).
Sparse principal component analysis for high dimensional multivariate time series
 International Conference on Artificial Intelligence and Statistics
, 2013
Abstract

Cited by 5 (4 self)
We study sparse principal component analysis for high-dimensional vector autoregressive time series under a doubly asymptotic framework, which allows the dimension d to scale with the series length T. We treat the transition matrix of the time series as a nuisance parameter and directly apply sparse principal component analysis to the multivariate time series as if the data were independent. We provide explicit non-asymptotic rates of convergence for leading eigenvector estimation and extend this result to principal subspace estimation. Our analysis illustrates that the spectral norm of the transition matrix plays an essential role in determining the final rates. We also characterize sufficient conditions under which sparse principal component analysis attains the optimal parametric rate. Our theoretical results are backed up by thorough numerical studies.
Nonconvex statistical optimization: Minimax-optimal sparse PCA in polynomial time. Available at arXiv:1408.5352
, 2014
Abstract

Cited by 4 (1 self)
Sparse principal component analysis (PCA) involves nonconvex optimization for which the global solution is hard to obtain. To address this issue, one popular approach is convex relaxation. However, such an approach may produce suboptimal estimators due to the relaxation effect. To optimally estimate sparse principal subspaces, we propose a two-stage computational framework named “tighten after relax”: within the “relax” stage, we approximately solve a convex relaxation of sparse PCA with early stopping to obtain a desired initial estimator; for the “tighten” stage, we propose a novel algorithm called sparse orthogonal iteration pursuit (SOAP), which iteratively refines the initial estimator by directly solving the underlying nonconvex problem. A key concept of this two-stage framework is the basin of attraction, a local region within which the “tighten” stage has the desired computational and statistical guarantees. We prove that the initial estimator obtained from the “relax” stage falls into such a region, and hence SOAP geometrically converges to a principal subspace estimator which is minimax-optimal within a certain model class. Unlike most existing sparse PCA estimators, our approach applies to non-spiked covariance models, and adapts to non-Gaussianity as well as dependent data settings. Moreover, through analyzing the computational complexity of the two stages, we illustrate an interesting phenomenon: a larger sample size can reduce the total iteration complexity. Our framework motivates a general paradigm for solving many complex statistical problems which involve nonconvex optimization with provable guarantees.
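The general flavor of a "tighten" step, orthogonal (subspace) iteration interleaved with a row-sparsity truncation, can be sketched as below. This schematic is an assumption-laden illustration only: SOAP's initialization and truncation rule are more careful than this, and the spiked matrix is a toy construction.

```python
import numpy as np

def truncated_subspace_iteration(A, d, k, n_iter=50):
    """Schematic sparse subspace iteration: multiply by A, orthonormalize,
    then keep only the k rows of largest Euclidean norm (a row-sparsity
    constraint on the d-dimensional subspace estimate)."""
    p = A.shape[0]
    V = np.eye(p)[:, :d]                       # deterministic orthonormal start
    for _ in range(n_iter):
        V, _ = np.linalg.qr(A @ V)             # orthogonal-iteration step
        norms = np.linalg.norm(V, axis=1)
        mask = np.zeros(p, dtype=bool)
        mask[np.argsort(norms)[-k:]] = True
        V[~mask] = 0.0                          # hard row truncation
        V, _ = np.linalg.qr(V)                  # re-orthonormalize
    return V

# Toy spiked model: a rank-2 principal subspace supported on rows {0, 1, 2}.
p, d, k = 15, 2, 3
u1 = np.zeros(p); u1[:3] = 1.0 / np.sqrt(3)
u2 = np.zeros(p); u2[0], u2[1] = 1.0 / np.sqrt(2), -1.0 / np.sqrt(2)
A = 7.0 * np.outer(u1, u1) + 5.0 * np.outer(u2, u2) + np.eye(p)
V_hat = truncated_subspace_iteration(A, d, k)
```

Every iterate is row-sparse, so the returned basis spans an estimate of the principal subspace while touching only k coordinates.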
Optimal Rates of Convergence of Transelliptical Component Analysis
, 2013
Abstract

Cited by 3 (1 self)
Han and Liu (2012) proposed a method named transelliptical component analysis (TCA) for conducting scale-invariant principal component analysis on high-dimensional data with transelliptical distributions. The transelliptical family assumes that the data follow an elliptical distribution after unspecified marginal monotone transformations. In a double asymptotic framework where the dimension d is allowed to increase with the sample size n, Han and Liu (2012) showed that one version of TCA attains a “nearly parametric” rate of convergence in parameter estimation when the parameter of interest is assumed to be sparse. This paper improves upon their results in two aspects: (i) under the non-sparse setting (i.e., the parameter of interest is not assumed to be sparse), we show that a version of TCA attains the optimal rate of convergence up to a logarithmic factor; (ii) under the sparse setting, we also lay out avenues to analyze the performance of the TCA estimator proposed in Han and Liu (2012). In particular, we provide a “sign sub-Gaussian condition” which is sufficient for TCA to attain an improved rate of convergence, and verify a subfamily of the transelliptical distributions satisfying this condition.
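For the transelliptical family, the rank statistic of choice is Kendall's tau, with the transform R_jl = sin(π τ_jl / 2) recovering the latent generalized correlation. The sketch below uses a naive O(n²) tau computation and a toy transformed-Gaussian sample; it illustrates the TCA idea rather than reproducing Han and Liu's code.

```python
import numpy as np

def kendall_tau_matrix(X):
    """Pairwise Kendall's tau via the quadratic-time sign formula
    (fine for small n; ties are assumed absent)."""
    n, p = X.shape
    T = np.eye(p)
    signs = [np.sign(X[:, j, None] - X[None, :, j]) for j in range(p)]
    for j in range(p):
        for l in range(j + 1, p):
            tau = (signs[j] * signs[l]).sum() / (n * (n - 1))
            T[j, l] = T[l, j] = tau
    return T

def tca_leading_eigenvector(X):
    """TCA-style estimate: plug Kendall's tau into sin(pi*tau/2) to get the
    latent correlation matrix, then return its leading eigenvector."""
    R = np.sin(np.pi * kendall_tau_matrix(X) / 2.0)
    np.fill_diagonal(R, 1.0)
    _, V = np.linalg.eigh(R)
    return V[:, -1]

# Cubing a coordinate is strictly monotone, so it preserves all pairwise
# sign comparisons and hence leaves the tau-based estimate unchanged.
rng = np.random.default_rng(3)
Z = rng.multivariate_normal(np.zeros(3),
                            [[1, .9, 0], [.9, 1, 0], [0, 0, 1]], 200)
X = np.column_stack([Z[:, 0] ** 3, Z[:, 1], Z[:, 2]])
v_hat = tca_leading_eigenvector(X)
```

The leading eigenvector concentrates on the correlated pair of coordinates, as the latent correlation structure dictates.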