EM Algorithms for PCA and SPCA (1998) [51 citations, 1 self]
Abstract:
I present an expectation-maximization (EM) algorithm for principal component analysis (PCA). The algorithm allows a few eigenvectors and eigenvalues to be extracted from large collections of high dimensional data. It is computationally very efficient in space and time. It also naturally accommodates missing information. I also introduce a new variant of PCA called sensible principal component analysis (SPCA) which defines a proper density model in the data space. Learning for SPCA is also done with an EM algorithm. I report results on synthetic and real data showing that these EM algorithms correctly and efficiently find the leading eigenvectors of the covariance of datasets in a few iterations using up to hundreds of thousands of datapoints in thousands of dimensions.

1 Why EM for PCA?

Principal component analysis (PCA) is a widely used dimensionality reduction technique in data analysis. Its popularity comes from three important properties. First, it is the optimal (in terms of mean squared error) linear scheme for compressing a set of high dimensional vectors into a set of lower dimensional vectors and then reconstructing. Second, the model parameters can be computed directly from the data, for example by diagonalizing the sample covariance. Third, compression and decompression are easy operations to perform given the model parameters: they require only matrix multiplications.

Despite these attractive features, however, PCA models have several shortcomings. One is that naive methods for finding the principal component directions have trouble with high dimensional data or large numbers of datapoints. Consider attempting to diagonalize the sample covariance matrix of n vectors in a space of p dimensions when n and p are several hundred or several thousand. Difficulties can arise both in the form of computational complexity and also data scarcity. Even computing the sample covariance itself is very costly, requiring O(np
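
The contrast drawn above, between diagonalizing the sample covariance and an EM iteration that avoids it, can be illustrated with a minimal NumPy sketch. The excerpt does not include the update equations, so the E-step and M-step below are an assumption: they follow the commonly cited zero-noise form of EM for PCA, in which the hidden states are a least-squares projection and the loading matrix is a least-squares refit. The function names (`pca_by_covariance`, `em_pca`), the iteration count, and the synthetic data are all hypothetical choices made for illustration, not the author's code.

```python
import numpy as np

def pca_by_covariance(Y, k):
    """Naive PCA: diagonalize the p x p sample covariance.
    Y has shape (p, n): n datapoints in p dimensions, one per column."""
    Yc = Y - Y.mean(axis=1, keepdims=True)
    cov = Yc @ Yc.T / Y.shape[1]            # explicit p x p covariance
    evals, evecs = np.linalg.eigh(cov)      # eigenvalues in ascending order
    order = np.argsort(evals)[::-1][:k]     # indices of the k largest
    return evecs[:, order], evals[order]

def em_pca(Y, k, n_iter=200, seed=0):
    """EM-style iteration for the leading k-dimensional principal subspace.
    Assumed update form (not stated in the excerpt):
      E-step: X = (C^T C)^{-1} C^T Y
      M-step: C = Y X^T (X X^T)^{-1}
    The p x p covariance is never formed."""
    rng = np.random.default_rng(seed)
    Yc = Y - Y.mean(axis=1, keepdims=True)
    C = rng.standard_normal((Y.shape[0], k))          # random initial loadings
    for _ in range(n_iter):
        X = np.linalg.solve(C.T @ C, C.T @ Yc)        # hidden states, k x n
        C = np.linalg.solve(X @ X.T, X @ Yc.T).T      # new loadings, p x k
    Q, _ = np.linalg.qr(C)                            # orthonormal basis of span(C)
    return Q

# Synthetic data: a 3-dimensional signal embedded in 20 dimensions plus noise.
rng = np.random.default_rng(1)
p, n, k = 20, 500, 3
basis, _ = np.linalg.qr(rng.standard_normal((p, k)))
latent = rng.standard_normal((k, n)) * np.array([[5.0], [3.0], [2.0]])
Y = basis @ latent + 0.1 * rng.standard_normal((p, n))

U, evals = pca_by_covariance(Y, k)
Q = em_pca(Y, k)

# Compare the two subspaces through their projection matrices.
gap = np.abs(Q @ Q.T - U @ U.T).max()

# Compression and decompression are plain matrix multiplications.
Yc = Y - Y.mean(axis=1, keepdims=True)
codes = Q.T @ Yc      # k x n compressed representation
recon = Q @ codes     # p x n reconstruction
```

Note that `em_pca` returns an orthonormal basis for the principal subspace rather than the individual eigenvectors; in this sketch, recovering the eigenvectors themselves would take an additional small k x k diagonalization of the projected data. Each iteration touches only p x k and k x n matrices.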
Citations
| 4345 | Maximum likelihood from incomplete data via the EM algorithm – Dempster, Laird, et al. - 1977 |
| 754 | The Algebraic Eigenvalue Problem – Wilkinson - 1965 |
| 263 | Mixtures of probabilistic principal component analyzers – Tipping, Bishop - 1999 |
| 213 | Probabilistic Principal Component Analysis – Tipping, Bishop - 1999 |
| 153 | The EM algorithm for mixtures of factor analyzers – Ghahramani, Hinton - 1996 |
| 124 | Supervised learning from incomplete data via an EM approach – Ghahramani, Jordan - 1994 |
| 94 | An Introduction to Latent Variable Models – Everitt - 1984 |
| 64 | ARPACK USERS GUIDE: Solution of Large Scale Eigenvalue Problems by Implicitly Restarted Arnoldi Methods – Lehoucq, Sorensen, et al. - 1998 |
| 52 | Chaotic dynamics of coherent structures – Sirovich - 1989 |

