EM Algorithms for PCA and SPCA (1998) [51 citations, 1 self]

by Sam Roweis
in Advances in Neural Information Processing Systems
http://www.cs.toronto.edu/~roweis/papers/empca.ps.gz

Abstract:

I present an expectation-maximization (EM) algorithm for principal component analysis (PCA). The algorithm allows a few eigenvectors and eigenvalues to be extracted from large collections of high dimensional data. It is computationally very efficient in space and time. It also naturally accommodates missing information. I also introduce a new variant of PCA called sensible principal component analysis (SPCA) which defines a proper density model in the data space. Learning for SPCA is also done with an EM algorithm. I report results on synthetic and real data showing that these EM algorithms correctly and efficiently find the leading eigenvectors of the covariance of datasets in a few iterations using up to hundreds of thousands of datapoints in thousands of dimensions.

1 Why EM for PCA?

Principal component analysis (PCA) is a widely used dimensionality reduction technique in data analysis. Its popularity comes from three important properties. First, it is the optimal (in terms of mean squared error) linear scheme for compressing a set of high dimensional vectors into a set of lower dimensional vectors and then reconstructing. Second, the model parameters can be computed directly from the data, for example by diagonalizing the sample covariance. Third, compression and decompression are easy operations to perform given the model parameters: they require only matrix multiplications.

Despite these attractive features, however, PCA models have several shortcomings. One is that naive methods for finding the principal component directions have trouble with high dimensional data or large numbers of datapoints. Consider attempting to diagonalize the sample covariance matrix of n vectors in a space of p dimensions when n and p are several hundred or several thousand. Difficulties can arise both in the form of computational complexity and also data scarcity. Even computing the sample covariance itself is very costly, requiring O(np
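To make the idea concrete, here is a minimal sketch of EM-style iterations for the single leading principal direction, written in plain Python. It is an illustration under stated assumptions, not the paper's reference implementation: the function name `em_pca_first_component`, the iteration count, and the random initialization are all hypothetical choices, and the sketch covers only the one-component case with no missing data. Each iteration alternates a projection step (E-step) and a least-squares refit of the direction (M-step), so the p-by-p sample covariance is never formed.

```python
import math
import random


def em_pca_first_component(data, n_iter=50, seed=0):
    """Estimate the leading principal direction of `data` by EM-style iterations.

    `data` is a list of equal-length lists of floats, one row per datapoint.
    Returns a unit-length list of floats; the sign of the direction is arbitrary.
    The name and defaults here are illustrative assumptions.
    """
    n, p = len(data), len(data[0])

    # Center the data: PCA is defined on deviations from the mean.
    mean = [sum(row[j] for row in data) / n for j in range(p)]
    y = [[row[j] - mean[j] for j in range(p)] for row in data]

    # Random initial direction (assumed initialization scheme).
    rng = random.Random(seed)
    c = [rng.gauss(0.0, 1.0) for _ in range(p)]

    for _ in range(n_iter):
        # E-step: project every point onto the current direction.
        cc = sum(v * v for v in c)
        x = [sum(ci * yi for ci, yi in zip(c, row)) / cc for row in y]

        # M-step: refit the direction by least squares given the projections.
        xx = sum(v * v for v in x)
        c = [sum(x[i] * y[i][j] for i in range(n)) / xx for j in range(p)]

    # Normalize so the result is a unit vector.
    norm = math.sqrt(sum(v * v for v in c))
    return [v / norm for v in c]
```

Calling `em_pca_first_component(rows)` on a list of numeric rows returns a unit vector. Each iteration touches every data entry a constant number of times, so the cost per iteration in this one-component sketch is proportional to n times p.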
