Results 1–10 of 13
Sparse higher-order principal components analysis
In Proceedings of the 15th International Conference on Artificial Intelligence and Statistics, Canary Islands
, 2012
Cited by 8 (2 self)
Abstract: Traditional tensor decompositions such as the CANDECOMP/PARAFAC (CP) and Tucker decompositions yield higher-order principal components that have been used to understand tensor data in areas such as neuroimaging, microscopy, chemometrics, and remote sensing. Sparsity in high-dimensional matrix factorizations and principal components has been well studied and exhibits many benefits; less attention has been given to sparsity in tensor decompositions. We propose two novel tensor decompositions that incorporate sparsity: the Sparse Higher-Order SVD and the Sparse CP Decomposition. The latter solves an ℓ1-norm penalized relaxation of the single-factor CP optimization problem, thereby automatically selecting relevant features for each tensor factor. Through experiments and a scientific data analysis example, we demonstrate the utility of our methods for dimension reduction, feature selection, signal recovery, and exploratory data analysis of high-dimensional tensors.
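The single-factor ℓ1-penalized CP idea described in this abstract can be sketched as alternating soft-thresholded updates over the three factors of a rank-1 fit. This is an illustrative reading, not the authors' exact algorithm; all function names and the initialization choice are ours.

```python
import numpy as np

def soft_threshold(z, lam):
    """Elementwise soft-thresholding, the proximal operator of the l1 norm."""
    return np.sign(z) * np.maximum(np.abs(z) - lam, 0.0)

def sparse_cp_rank1(X, lam=0.1, n_iter=50):
    """Sketch of a single-factor l1-penalized CP fit for a 3-way tensor:
    alternate over the three factors, each update being a soft-thresholded
    tensor contraction followed by renormalization."""
    # Initialize each factor with the leading left singular vector of the
    # corresponding mode unfolding (a common, deterministic choice).
    u = np.linalg.svd(X.reshape(X.shape[0], -1), full_matrices=False)[0][:, 0]
    v = np.linalg.svd(np.moveaxis(X, 1, 0).reshape(X.shape[1], -1),
                      full_matrices=False)[0][:, 0]
    w = np.linalg.svd(np.moveaxis(X, 2, 0).reshape(X.shape[2], -1),
                      full_matrices=False)[0][:, 0]
    for _ in range(n_iter):
        u = soft_threshold(np.einsum('ijk,j,k->i', X, v, w), lam)
        u /= max(np.linalg.norm(u), 1e-12)
        v = soft_threshold(np.einsum('ijk,i,k->j', X, u, w), lam)
        v /= max(np.linalg.norm(v), 1e-12)
        w = soft_threshold(np.einsum('ijk,i,j->k', X, u, v), lam)
        w /= max(np.linalg.norm(w), 1e-12)
    d = np.einsum('ijk,i,j,k->', X, u, v, w)  # magnitude of the rank-1 fit
    return d, u, v, w
```

The soft-thresholding step is what zeros out irrelevant features in each factor, giving the automatic feature selection the abstract refers to.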
Regularized partial least squares with an application to NMR spectroscopy
Statistical Analysis and Data Mining
Cited by 3 (0 self)
Abstract: High-dimensional data common in genomics, proteomics, and chemometrics often contain complicated correlation structures. Recently, partial least squares (PLS) and Sparse PLS methods have gained attention in these areas as dimension reduction techniques in the context of supervised data analysis. We introduce a framework for Regularized PLS by solving a relaxation of the SIMPLS optimization problem with penalties on the PLS loading vectors. Our approach enjoys many advantages, including flexibility, general penalties, easy interpretation of results, and fast computation in high-dimensional settings. We also outline extensions of our methods leading to novel methods for Non-negative PLS and Generalized PLS, an adaptation of PLS for structured data. We demonstrate the utility of our methods through simulations and a case study on proton nuclear magnetic resonance (NMR) spectroscopy data.
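The flavor of penalized PLS loadings can be sketched for the first direction: classical PLS takes the loading proportional to the covariance vector X'y, and an ℓ1 penalty soft-thresholds it. This is a simplified illustration of the general idea, not the paper's SIMPLS relaxation; the relative-threshold choice is an assumption of ours.

```python
import numpy as np

def soft_threshold(z, lam):
    """Elementwise soft-thresholding, the proximal operator of the l1 norm."""
    return np.sign(z) * np.maximum(np.abs(z) - lam, 0.0)

def sparse_pls_direction(X, y, lam=0.5):
    """Sketch of a penalized first PLS loading: the unpenalized loading
    maximizes Cov(Xw, y) subject to ||w|| = 1, so w is proportional to X'y;
    an l1 penalty soft-thresholds that covariance vector, zeroing out
    features weakly related to the response."""
    z = X.T @ y                                      # covariance direction X'y
    w = soft_threshold(z, lam * np.max(np.abs(z)))   # threshold relative to max
    nrm = np.linalg.norm(w)
    return w / nrm if nrm > 0 else w
```

In a full implementation, subsequent directions would be extracted after deflating X, exactly as in ordinary PLS.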
Regularized tensor factorizations and higher-order principal components analysis
, 2012
Cited by 3 (0 self)
Abstract: High-dimensional tensors, or multiway data, are becoming prevalent in areas such as biomedical imaging, chemometrics, networking, and bibliometrics. Traditional approaches to finding lower-dimensional representations of tensor data include flattening the data and applying matrix factorizations such as principal components analysis (PCA), or employing tensor decompositions such as the CANDECOMP/PARAFAC (CP) and Tucker decompositions. The former can lose important structure in the data, while the latter Higher-Order PCA (HOPCA) methods can be problematic in high dimensions with many irrelevant features. We introduce frameworks for sparse tensor factorizations, or Sparse HOPCA, based on heuristic algorithmic approaches and by solving penalized optimization problems related to the CP decomposition. Extensions of these approaches lead to methods for general regularized tensor factorizations, multiway Functional HOPCA, and generalizations of HOPCA for structured data. We illustrate the utility of our methods for dimension reduction, feature selection, and signal recovery on simulated data, multidimensional microarrays, and functional MRIs.
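One plausible reading of the "heuristic algorithmic" route to Sparse HOPCA is a higher-order SVD whose per-mode loadings are soft-thresholded; the sketch below is our own illustration under that assumption, not the paper's algorithm.

```python
import numpy as np

def unfold(X, mode):
    """Mode-n matricization of a 3-way array: mode becomes the rows."""
    return np.moveaxis(X, mode, 0).reshape(X.shape[mode], -1)

def sparse_hosvd(X, ranks, lam=0.0):
    """Sketch of a heuristic sparse HOSVD: take the leading singular vectors
    of each mode unfolding, then soft-threshold and renormalize each column
    to encourage sparse loadings."""
    factors = []
    for mode, r in enumerate(ranks):
        U = np.linalg.svd(unfold(X, mode), full_matrices=False)[0][:, :r]
        U = np.sign(U) * np.maximum(np.abs(U) - lam, 0.0)   # sparsify loadings
        U /= np.maximum(np.linalg.norm(U, axis=0, keepdims=True), 1e-12)
        factors.append(U)
    # Core tensor G = X x1 U1' x2 U2' x3 U3'
    G = np.einsum('ijk,ia,jb,kc->abc', X, *factors)
    return G, factors
```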
Tucker tensor regression and neuroimaging analysis. arXiv preprint arXiv:1304.5637
, 2013
Cited by 2 (0 self)
Abstract: Large-scale neuroimaging studies have been collecting brain images of study individuals, which take the form of two-dimensional, three-dimensional, or higher-dimensional arrays, also known as tensors. Addressing scientific questions arising from such data demands new regression models that take multidimensional arrays as covariates. Simply turning an image array into a long vector causes extremely high dimensionality that compromises classical regression methods and, more seriously, destroys the inherent spatial structure of array data that possesses a wealth of information. In this article, we propose a family of generalized linear tensor regression models based upon the Tucker decomposition of regression coefficient arrays. Effectively exploiting the low-rank structure of tensor covariates brings the ultrahigh dimensionality down to a manageable level, leading to efficient estimation. We demonstrate numerically that the new model can provide a sound recovery of even high-rank signals, and asymptotically that the model consistently estimates the best Tucker-structure approximation to the full array model in the sense of Kullback-Leibler distance. The new model is also compared to a recently proposed tensor regression model that relies upon an alternative CANDECOMP/PARAFAC (CP) decomposition.
Key Words: CP decomposition; magnetic resonance image; tensor; Tucker decomposition.
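The dimension reduction this abstract describes comes from parameterizing the coefficient array by its Tucker factors instead of storing it in full. For a matrix covariate, the linear predictor ⟨B, X⟩ with B = G ×1 U1 ×2 U2 can be computed without ever forming B, as sketched below (an illustration of the model structure, not the authors' fitting code; names are ours).

```python
import numpy as np

def tucker_coefficient(G, U1, U2):
    """Full coefficient matrix B = G x1 U1 x2 U2 for a 2-way covariate.
    Storing (G, U1, U2) instead of B is what makes the problem tractable."""
    return np.einsum('ab,ia,jb->ij', G, U1, U2)

def linear_predictor(X, G, U1, U2):
    """Inner product <B, X> computed in the factored form:
    <B, X> = <G, U1' X U2>, so B never needs to be materialized."""
    return np.einsum('ab,ia,jb,ij->', G, U1, U2, X)
```

Fitting then alternates GLM updates over G, U1, and U2, each of which is a low-dimensional problem.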
Usage
, 2013
Description: Functions for computing sparse generalized principal components, including functions …
Principal Component Analysis and Optimization: A Tutorial (licensed under a Creative Commons Attribution 3.0 License)
Abstract: Principal component analysis (PCA) is one of the most widely used multivariate techniques in statistics. It is commonly used to reduce the dimensionality of data in order to examine its underlying structure and the covariance/correlation structure of a set of variables. While singular value decomposition provides a simple means of identifying the principal components (PCs) for classical PCA, solutions achieved in this manner may not possess certain desirable properties, including robustness, smoothness, and sparsity. In this paper, we present several optimization problems related to PCA by considering various geometric perspectives. New techniques for PCA can be developed by altering the optimization problems to which principal component loadings are the optimal solutions.
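The classical SVD route to PCA that this abstract takes as its starting point is short enough to sketch directly: center the data, take the top right singular vectors as loadings, and project to get scores.

```python
import numpy as np

def pca_svd(X, k):
    """Classical PCA via SVD of the centered data matrix: the loadings are
    the top-k right singular vectors, the scores their projections, and the
    explained variances come from the singular values."""
    Xc = X - X.mean(axis=0)
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    loadings = Vt[:k].T                      # p x k principal directions
    scores = Xc @ loadings                   # n x k component scores
    explained_var = s[:k] ** 2 / (X.shape[0] - 1)
    return scores, loadings, explained_var
```

The penalized variants the tutorial goes on to discuss modify the optimization problem these singular vectors solve, trading this closed form for robustness, smoothness, or sparsity.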
Local-Aggregate Modeling for Big-Data via Distributed Optimization: Applications to Neuroimaging
Abstract: Technological advances have led to a proliferation of structured big data that is often collected and stored in a distributed manner. Examples include climate data, social networking data, crime incidence data, and biomedical imaging. We are specifically motivated to build predictive models for multi-subject neuroimaging data based on each subject's brain imaging scans. This is an ultra-high-dimensional problem that consists of a matrix of covariates (brain locations by time points) for each subject; few methods currently exist to fit supervised models directly to this tensor data. We propose a novel modeling and algorithmic strategy for applying generalized linear models (GLMs) to this massive tensor data in which one set of variables is associated with locations. Our method begins by fitting GLMs to each location separately, and then builds an ensemble by blending information across locations through regularization with what we term an aggregating penalty. Our so-called Local-Aggregate Model can be fit in a completely distributed manner over the locations, and thus greatly reduces the computational burden. Furthermore, we propose to select the appropriate model through a novel sequence of faster algorithmic solutions that is similar to regularization paths. We demonstrate both the computational and predictive modeling advantages of our methods via simulations and an EEG classification problem.
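The fit-locally-then-blend structure described here can be caricatured in a few lines: fit a separate ridge-penalized linear model at each location, then shrink the local coefficient vectors toward their common mean. This is a crude stand-in for the paper's aggregating penalty (which couples locations inside a joint optimization), and the linear-model and blending choices below are our assumptions.

```python
import numpy as np

def local_aggregate_fit(Xs, y, lam=1.0, alpha=0.5):
    """Sketch of a local-aggregate strategy: fit a ridge linear model per
    location (this step is embarrassingly parallel over locations), then
    blend by shrinking every local coefficient vector toward their mean.
    alpha=0 keeps purely local fits; alpha=1 fully pools across locations."""
    local_fits = []
    for X in Xs:                             # X: n x p covariates at one location
        p = X.shape[1]
        b = np.linalg.solve(X.T @ X + lam * np.eye(p), X.T @ y)
        local_fits.append(b)
    B = np.stack(local_fits)                 # L x p matrix of local coefficients
    return (1 - alpha) * B + alpha * B.mean(axis=0)
```

The distributed-computation advantage the abstract emphasizes comes from the first loop: each location's fit needs only that location's data.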