Results 1–10 of 20
Sample Complexity of Dictionary Learning and other Matrix Factorizations
, 2013
Abstract

Cited by 8 (4 self)
HAL is a multidisciplinary open access archive for the deposit and dissemination of scientific research documents, whether they are published or not. The documents may come from teaching and research institutions in France or abroad, or from public or private research centers.
Alternating direction method of multipliers for sparse principal component analysis
 SIAM Review
Alternating Maximization: Unifying Framework for 8 Sparse PCA Formulations and Efficient Parallel Codes
, 2012
Abstract

Cited by 6 (0 self)
Given a multivariate data set, sparse principal component analysis (SPCA) aims to extract several linear combinations of the variables that together explain the variance in the data as much as possible, while controlling the number of nonzero loadings in these combinations. In this paper we consider 8 different optimization formulations for computing a single sparse loading vector; these are obtained by combining the following factors: we employ two norms for measuring variance (L2, L1) and two sparsity-inducing norms (L0, L1), which are used in two different ways (constraint, penalty). Three of our formulations, notably the one with L0 constraint and L1 variance, have not been considered in the literature. We give a unifying reformulation which we propose to solve via a natural alternating maximization (AM) method. We show that the AM method is nontrivially equivalent to GPower (Journée et al., JMLR 11:517–553, 2010) for all our formulations. Besides this, we provide 24 efficient parallel SPCA implementations: 3 codes (multicore, GPU and cluster) for each of the 8 problems. Parallelism in the methods is aimed at (i) speeding up computations (our GPU code can be 100 times faster than an efficient serial code written in C++), (ii) obtaining solutions explaining more variance and (iii) dealing with big data problems (our cluster code is able to solve a 357 GB problem in about a minute).
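To illustrate the alternating maximization scheme, here is a minimal sketch of one of the eight formulations (L2 variance with an L0 cardinality constraint). The function name and the hard-thresholding details are our own simplification for a single loading vector, not the paper's parallel codes.

```python
import numpy as np

def sparse_pca_am(X, s, iters=100, seed=0):
    """Single sparse loading vector via alternating maximization:
    maximize ||X w||_2 subject to ||w||_2 = 1, ||w||_0 <= s.
    Alternates z = Xw / ||Xw|| with a hard-thresholded update of w."""
    rng = np.random.default_rng(seed)
    p = X.shape[1]
    w = rng.standard_normal(p)
    w /= np.linalg.norm(w)
    for _ in range(iters):
        z = X @ w
        z /= np.linalg.norm(z)
        g = X.T @ z                       # ascent direction for w
        g[np.argsort(np.abs(g))[:-s]] = 0.0  # keep only the s largest entries
        w = g / np.linalg.norm(g)
    return w
```

The returned vector has at most s nonzero loadings; the explained variance is ||X w||². Swapping the norms or moving the sparsity term into a penalty yields the other formulations the abstract enumerates.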
Sparse Principal Component Analysis with Constraints
 PROCEEDINGS OF THE TWENTY-SIXTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE
, 2012
Abstract

Cited by 5 (1 self)
Sparse principal component analysis is a variant of classical principal component analysis which finds linear combinations of a small number of features that maximize variance across data. In this paper we propose a methodology for adding two general types of feature grouping constraints into the original sparse PCA optimization procedure. We derive convex relaxations of the considered constraints, ensuring the convexity of the resulting optimization problem. Empirical evaluation on three real-world problems, one in process monitoring sensor networks and two in social networks, serves to illustrate the usefulness of the proposed methodology.
Sparse PCA through Low-rank Approximations
Abstract

Cited by 4 (1 self)
We introduce a novel algorithm that computes the k-sparse principal component of a positive semidefinite matrix A. Our algorithm is combinatorial and operates by examining a discrete set of special vectors lying in a low-dimensional eigensubspace of A. We obtain provable approximation guarantees that depend on the spectral profile of the matrix: the faster the eigenvalue decay, the better the quality of our approximation. For example, if the eigenvalues of A follow a power-law decay, we obtain a polynomial-time approximation algorithm for any desired accuracy. We implement our algorithm and test it on multiple artificial and real data sets. Due to a feature elimination step, it is possible to perform sparse PCA on data sets consisting of millions of entries in a few minutes. Our experimental evaluation shows that our scheme is nearly optimal while finding very sparse vectors. We compare to the prior state of the art and show that our scheme matches or outperforms previous algorithms in all tested data sets.
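The low-rank idea is easiest to see in the rank-1 special case, where the optimal k-sparse component can be read directly off the leading eigenvector; the sketch below is our own simplification of that case, not the paper's full low-rank search.

```python
import numpy as np

def ksparse_pc_rank1(A, k):
    """Rank-1 special case of sparse PCA: when the PSD matrix A is
    (approximately) rank one, the best k-sparse principal component keeps
    the k largest-magnitude coordinates of the leading eigenvector and
    renormalizes the result to unit length."""
    vals, vecs = np.linalg.eigh(A)        # eigenvalues in ascending order
    u = vecs[:, -1]                       # leading eigenvector
    keep = np.argsort(np.abs(u))[-k:]     # indices of the k largest entries
    v = np.zeros_like(u)
    v[keep] = u[keep]
    return v / np.linalg.norm(v)
```

For higher ranks the paper examines a discrete set of candidate supports generated from the low-dimensional eigensubspace; the rank-1 step above is the building block that makes the eigenvalue-decay guarantee plausible.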
Optimal Rates of Convergence of Transelliptical Component Analysis
, 2013
Abstract

Cited by 3 (1 self)
Han and Liu (2012) proposed a method named transelliptical component analysis (TCA) for conducting scale-invariant principal component analysis on high-dimensional data with transelliptical distributions. The transelliptical family assumes that the data follow an elliptical distribution after unspecified marginal monotone transformations. In a double asymptotic framework where the dimension d is allowed to increase with the sample size n, Han and Liu (2012) showed that one version of TCA attains a "nearly parametric" rate of convergence in parameter estimation when the parameter of interest is assumed to be sparse. This paper improves upon their results in two aspects: (i) under the nonsparse setting (i.e., the parameter of interest is not assumed to be sparse), we show that a version of TCA attains the optimal rate of convergence up to a logarithmic factor; (ii) under the sparse setting, we also lay out avenues to analyze the performance of the TCA estimator proposed in Han and Liu (2012). In particular, we provide a "sign sub-Gaussian condition" which is sufficient for TCA to attain an improved rate of convergence, and we verify a subfamily of the transelliptical distributions satisfying this condition.
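The ingredient that makes TCA invariant to the unspecified monotone marginal transformations is a rank-based correlation estimate: Kendall's tau with a sine transform. A minimal sketch of that estimate (function and variable names are ours), assuming rows of X are i.i.d. samples:

```python
import numpy as np

def kendall_tau(x, y):
    """Kendall's tau: average concordance sign over all sample pairs."""
    n = len(x)
    s = 0.0
    for a in range(n):
        for b in range(a + 1, n):
            s += np.sign(x[a] - x[b]) * np.sign(y[a] - y[b])
    return 2.0 * s / (n * (n - 1))

def tca_correlation(X):
    """Latent correlation estimate R[j, k] = sin(pi/2 * tau_jk).
    Being rank-based, it is unchanged by strictly increasing
    transformations applied to any marginal."""
    p = X.shape[1]
    R = np.eye(p)
    for j in range(p):
        for k in range(j + 1, p):
            t = kendall_tau(X[:, j], X[:, k])
            R[j, k] = R[k, j] = np.sin(np.pi / 2.0 * t)
    return R
```

TCA then estimates the leading eigenvector of this matrix instead of the ordinary sample correlation matrix; the rates discussed in the abstract concern how well that eigenvector is recovered as d grows with n.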
Efficient Sparse Principal Component Analysis with Secular Backwards Elimination
Abstract
Sparse PCA has become a popular method for creating simple yet informative loading vectors for data analysis. In this paper, we present a greedy backwards elimination algorithm for sparse PCA that is computationally competitive, with complexity O(n²) for matrices with rapidly decaying singular values. We employ novel techniques from numerical linear algebra, including solving secular equations and low-rank matrix approximation preprocessing. Theoretical guarantees are also provided on all sparse principal components for a given deflation procedure. Tests with synthetic data and real-world data sets demonstrate the competitiveness of this algorithm with leading algorithms for sparse PCA.
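The greedy backwards elimination idea can be sketched naively as follows. The paper's contribution is making the inner leading-eigenvalue updates cheap via secular equations; this illustrative version (our own) simply recomputes each eigenvalue from scratch.

```python
import numpy as np

def backward_elim_spca(A, k):
    """Greedy backwards elimination for a k-sparse principal component of
    a PSD matrix A: starting from the full support, repeatedly drop the
    index whose removal best preserves the leading eigenvalue of the
    remaining principal submatrix, then return the leading eigenvector
    of the final k x k submatrix embedded in the full space."""
    support = list(range(A.shape[0]))
    while len(support) > k:
        best_val, best_i = -np.inf, None
        for i in range(len(support)):
            sub = [s for j, s in enumerate(support) if j != i]
            lam = np.linalg.eigvalsh(A[np.ix_(sub, sub)])[-1]
            if lam > best_val:
                best_val, best_i = lam, i
        support.pop(best_i)
    v = np.zeros(A.shape[0])
    w = np.linalg.eigh(A[np.ix_(support, support)])[1][:, -1]
    v[support] = w
    return v
```

Each elimination round costs one eigendecomposition per surviving index here; replacing those with secular-equation updates is what brings the paper's method down to the stated complexity for rapidly decaying spectra.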
Coordinate descent for learning orthogonal matrices through Givens rotations
Abstract
Optimizing over the set of orthogonal matrices is a central component in problems like sparse PCA or tensor decomposition. Unfortunately, such optimization is hard since simple operations on orthogonal matrices easily break orthogonality, and correcting orthogonality usually costs a large amount of computation. Here we propose a framework for optimizing orthogonal matrices that is the parallel of coordinate descent in Euclidean spaces. It is based on Givens rotations, a fast-to-compute operation that affects a small number of entries in the learned matrix and preserves orthogonality. We show two applications of this approach: an algorithm for tensor decompositions used in learning mixture models, and an algorithm for sparse PCA. We study the parameter regime where a Givens rotation approach converges faster and achieves a superior model on a genome-wide, brain-wide mRNA expression dataset.
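A single coordinate step of such a scheme can be sketched as below: rotate one pair of columns by an angle θ, which preserves orthogonality exactly, so no re-orthogonalization is ever needed. A full method would sweep over pairs (i, j) and choose θ to improve the objective; that line search is omitted in this sketch.

```python
import numpy as np

def givens_step(U, i, j, theta):
    """Apply a Givens rotation by angle theta in the (i, j) coordinate
    plane to the columns of U. Touches only two columns and exactly
    preserves the orthogonality of U."""
    c, s = np.cos(theta), np.sin(theta)
    G = np.array([[c, -s],
                  [s,  c]])
    U = U.copy()
    U[:, [i, j]] = U[:, [i, j]] @ G
    return U
```

Because each step modifies only two columns, the per-step cost is O(n) for an n x n matrix, which is what makes the coordinate-descent analogy work on the orthogonal group.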
Understanding Large Text Corpora via Sparse Machine Learning
, 2012
Abstract
Sparse machine learning has recently emerged as a powerful tool to obtain models of high-dimensional data with a high degree of interpretability, at low computational cost. The approach has been successfully used in many areas, such as signal and image processing. This paper posits that these methods can be extremely useful in the analysis of large collections of text documents, without requiring user expertise in machine learning. Our approach relies on three main ingredients: (a) multi-document text summarization and (b) comparative summarization of two corpora, both using sparse regression or classification; (c) sparse principal components and sparse graphical models for unsupervised analysis and visualization of large text corpora. We validate our methods using a corpus of Aviation Safety Reporting System (ASRS) reports and demonstrate that the methods can reveal causal and contributing factors in runway incursions. Furthermore, we show that the methods automatically discover four main tasks that pilots perform during flight, which can aid in further understanding the causal and contributing factors to runway incursions and other drivers for aviation safety incidents. We also provide a comparative study involving other commonly used datasets, and report on the competitiveness of sparse machine learning compared to state-of-the-art methods such as Latent Dirichlet Allocation (LDA).
Understanding Large Text Corpora via Sparse Machine Learning
, 2013
Abstract
Sparse machine learning has recently emerged as a powerful tool to obtain models of high-dimensional data with a high degree of interpretability, at low computational cost. The approach has been successfully used in many areas, such as signal and image processing. This article posits that these methods can be extremely useful in the analysis of large collections of text documents, without requiring user expertise in machine learning. Our approach relies on three main ingredients: (i) multi-document text summarization; (ii) comparative summarization of two corpora, both using sparse regression or classification; (iii) sparse principal components and sparse graphical models for unsupervised analysis and visualization of large text corpora. We validate our methods using a corpus of Aviation Safety Reporting System (ASRS) reports and demonstrate that the methods can reveal causal and contributing factors in runway incursions. Furthermore, we show that the methods automatically discover four main tasks that pilots perform during flight, which can aid in further understanding the causal and contributing factors to runway incursions and other drivers for aviation safety incidents. We also provide a comparative study involving other commonly used datasets, and report on the competitiveness of sparse machine learning compared to state-of-the-art methods such as latent Dirichlet allocation (LDA).