Results 1–10 of 237
Multilinear principal component analysis of tensor objects for recognition
In Proc. Int. Conf. Pattern Recognition, 2006
Cited by 88 (15 self)
This paper introduces a multilinear principal component analysis (MPCA) framework for tensor object feature extraction. Objects of interest in many computer vision and pattern recognition applications, such as 2D/3D images and video sequences, are naturally described as tensors, or multilinear arrays. The proposed framework performs feature extraction by determining a multilinear projection that captures most of the original tensorial input variation. The solution is iterative in nature and proceeds by decomposing the original problem into a series of multiple projection subproblems. As part of this work, methods for subspace dimensionality determination are proposed and analyzed. It is shown that the MPCA framework discussed in this work supplants existing heterogeneous solutions such as classical principal component analysis (PCA) and its 2D variant (2DPCA). Finally, a tensor object recognition system is proposed with the introduction of a discriminative tensor feature selection mechanism and a novel classification strategy, and applied to the problem of gait recognition. Results presented here indicate MPCA's utility as a feature extraction tool. It is shown that even without a fully optimized design, an MPCA-based gait recognition module achieves highly competitive performance and compares favorably to state-of-the-art gait recognizers. Index Terms: dimensionality reduction, feature extraction, gait recognition, multilinear principal component analysis (MPCA), tensor objects.
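The iterative scheme described in this abstract (alternating over mode-wise projection subproblems) can be sketched in a few lines of numpy. This is a minimal illustration for second-order tensor samples, not the authors' implementation; the data, dimensions, and iteration count are all arbitrary choices for the sketch.

```python
import numpy as np

rng = np.random.default_rng(0)
n, I1, I2 = 60, 8, 6      # 60 tensor (matrix) samples of size 8x6
P1, P2 = 3, 2             # target mode-wise subspace dimensions

X = rng.standard_normal((n, I1, I2))
X = X - X.mean(axis=0)    # center the tensor samples

# initialize the two mode projections with truncated identities
U1 = np.eye(I1)[:, :P1]
U2 = np.eye(I2)[:, :P2]

for _ in range(10):       # alternate over the two projection subproblems
    # mode-1 subproblem: covariance of X_m @ U2, aggregated over samples
    C1 = sum(x @ U2 @ (x @ U2).T for x in X)
    w, V = np.linalg.eigh(C1)
    U1 = V[:, np.argsort(w)[::-1][:P1]]
    # mode-2 subproblem: covariance of U1.T @ X_m, aggregated over samples
    C2 = sum((U1.T @ x).T @ (U1.T @ x) for x in X)
    w, V = np.linalg.eigh(C2)
    U2 = V[:, np.argsort(w)[::-1][:P2]]

# extracted features: a small P1 x P2 core tensor per sample
Y = np.einsum('ai,nab,bj->nij', U1, X, U2)
captured = (Y**2).sum() / (X**2).sum()  # fraction of variation captured
```

Each pass fixes one mode's projection and solves an ordinary eigenproblem for the other, which is the "series of multiple projection subproblems" the abstract refers to.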
Finite sample approximation results for principal component analysis: A matrix perturbation approach
Cited by 66 (15 self)
Principal component analysis (PCA) is a standard tool for dimensionality reduction of a set of n observations (samples), each with p variables. In this paper, using a matrix perturbation approach, we study the non-asymptotic relation between the eigenvalues and eigenvectors of PCA computed on a finite sample of size n and those of the limiting population PCA as n → ∞. As is common in machine learning, we present a finite-sample theorem which holds with high probability for the closeness between the leading eigenvalue and eigenvector of sample PCA and population PCA under a spiked covariance model. In addition, we consider the relation between finite-sample PCA and the asymptotic results in the joint limit p, n → ∞ with p/n = c. We present a matrix perturbation view of the "phase transition phenomenon" and a simple linear-algebra-based derivation of the eigenvalue and eigenvector overlap in this asymptotic limit. Moreover, our analysis also applies for finite p and n, where we show that although there is no sharp phase transition as in the infinite case, either as a function of noise level or as a function of sample size n, the eigenvector of sample PCA may exhibit a sharp "loss of tracking," suddenly losing its relation to the (true) eigenvector of the population PCA matrix. This occurs due to a crossover between the eigenvalue due to the signal and the largest eigenvalue due to noise, whose eigenvector points in a random direction.
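The "loss of tracking" phenomenon is easy to reproduce numerically under a spiked covariance model: well above the signal-strength threshold, the leading sample eigenvector overlaps strongly with the population spike; well below it, the overlap collapses. A small numpy demonstration (dimensions, seed, and signal strengths are illustrative, not from the paper):

```python
import numpy as np

rng = np.random.default_rng(1)
p, n = 200, 400                 # dimension and sample size, c = p/n = 0.5
u = np.zeros(p); u[0] = 1.0     # population spike direction

def top_eigvec_overlap(signal):
    # spiked model: x = sqrt(signal) * g * u + standard Gaussian noise,
    # so the population covariance is I + signal * u u^T
    Z = rng.standard_normal((n, p))
    X = Z + np.sqrt(signal) * rng.standard_normal((n, 1)) * u
    C = X.T @ X / n             # sample covariance
    w, V = np.linalg.eigh(C)
    v = V[:, -1]                # leading sample eigenvector
    return abs(v @ u)           # cosine with the population eigenvector

strong = top_eigvec_overlap(5.0)   # well above the sqrt(p/n) threshold
weak   = top_eigvec_overlap(0.05)  # well below it: tracking is lost
```

Below the threshold the largest sample eigenvalue is due to noise, and its eigenvector points in an essentially random direction, exactly as the abstract describes.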
A novel anomaly detection scheme based on principal component classifier
In Proceedings of the IEEE Foundations and New Directions of Data Mining Workshop, in conjunction with the Third IEEE International Conference on Data Mining (ICDM'03), 2003
Cited by 64 (5 self)
This paper proposes a novel scheme that uses a robust principal component classifier for the intrusion detection problem, where the training data may be unsupervised. Assuming that anomalies can be treated as outliers, an intrusion predictive model is constructed from the major and minor principal components of normal instances. The difference of an anomaly from a normal instance is measured by its distance in the principal component space. A distance based on the major components that account for 50% of the total variation and the minor components with eigenvalues less than 0.20 is shown to work well. Experiments with the KDD Cup 1999 data demonstrate that the proposed method achieves 98.94% recall and 97.89% precision with a false alarm rate of 0.92%, and outperforms the nearest neighbor method, the density-based local outliers (LOF) approach, and outlier detection algorithms based on the Canberra metric.
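The two distances the abstract mentions (one over major components, one over minor components) can be sketched as follows. The 50% and 0.20 cutoffs come from the abstract; the synthetic data, the 99th-percentile thresholds, and all names are illustrative assumptions, not the authors' setup.

```python
import numpy as np

rng = np.random.default_rng(2)
# "normal" training instances: correlated Gaussian features
n, p = 500, 5
A = rng.standard_normal((p, p))
train = rng.standard_normal((n, p)) @ A

mu, std = train.mean(axis=0), train.std(axis=0)
Z = (train - mu) / std
lam, V = np.linalg.eigh(np.corrcoef(Z, rowvar=False))
order = np.argsort(lam)[::-1]
lam, V = lam[order], V[:, order]

major = np.cumsum(lam) / lam.sum() <= 0.50   # components covering ~50% of variation
major[0] = True                              # always keep at least the first
minor = lam < 0.20                           # trailing small-eigenvalue components

def pc_scores(x):
    y = V.T @ ((x - mu) / std)               # coordinates in PC space
    d2 = y**2 / lam                          # eigenvalue-normalized squared distances
    return d2[major].sum(), d2[minor].sum()

# thresholds taken as empirical quantiles over the normal data (an assumption)
maj_t, min_t = [np.quantile([pc_scores(x)[i] for x in train], 0.99) for i in (0, 1)]

def is_anomaly(x):
    a, b = pc_scores(x)
    return a > maj_t or b > min_t
```

The major-component distance catches points that are extreme along the dominant correlation structure, while the minor-component distance catches points that violate that structure even when their overall magnitude is unremarkable.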
Estimation of subspace arrangements with applications in modeling and segmenting mixed data
2006
Cited by 60 (4 self)
Recently, many scientific and engineering applications have involved the challenging task of analyzing large amounts of unsorted high-dimensional data with very complicated structures. From both geometric and statistical points of view, such unsorted data are considered mixed, as different parts of the data have significantly different structures that cannot be described by a single model. In this paper we propose to use subspace arrangements, i.e. unions of multiple subspaces, for modeling mixed data: each subspace in the arrangement models a homogeneous subset of the data, so that multiple subspaces together capture the heterogeneous structures within the data set. We give a comprehensive introduction to a new approach for the estimation of subspace arrangements, known as generalized principal component analysis (GPCA). In particular, we provide a comprehensive summary of important algebraic properties and statistical facts that are crucial for making the inference of subspace arrangements both efficient and robust, even when the given data are corrupted by noise or contaminated with outliers. This new method in many ways improves and generalizes extant methods for modeling or clustering mixed data, and it has been successfully applied to many real-world problems in computer vision, image processing, and system identification; we examine several representative applications. This paper is intended to be expository in nature. However, so that it may serve as a more complete reference for both theoreticians and practitioners, we take the liberty of filling in several gaps between the theory and the practice in the existing literature.
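The algebraic core of GPCA can be shown on a toy arrangement: points on a union of two lines through the origin all satisfy a single quadratic polynomial, which can be fitted linearly via a Veronese embedding, and whose gradient at a point recovers the normal of that point's subspace. This is a deliberately minimal two-line sketch, not the general algorithm:

```python
import numpy as np

rng = np.random.default_rng(3)
# mixed data: points drawn from two lines through the origin in R^2,
# y = 2x and y = -x  (a toy subspace arrangement)
t = rng.standard_normal(100)
pts = np.concatenate([np.c_[t[:50], 2 * t[:50]], np.c_[t[50:], -t[50:]]])

# degree-2 Veronese embedding: every point satisfies one quadratic form
V = np.c_[pts[:, 0]**2, pts[:, 0] * pts[:, 1], pts[:, 1]**2]
_, _, Vt = np.linalg.svd(V)
c = Vt[-1]                      # coefficients of the vanishing polynomial

def grad(x, y):
    # gradient of c0*x^2 + c1*x*y + c2*y^2: the subspace normal at (x, y)
    g = np.array([2 * c[0] * x + c[1] * y, c[1] * x + 2 * c[2] * y])
    return g / np.linalg.norm(g)

# normals recovered at one point per line (determined up to sign)
n1 = grad(1.0, 2.0)             # should align with (2, -1), normal of y = 2x
n2 = grad(1.0, -1.0)            # should align with (1, 1), normal of y = -x
```

Fitting the vanishing polynomial is a null-space computation, and differentiating it segments the data; the robust and noisy-data refinements the abstract discusses build on exactly this pipeline.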
Email Surveillance Using Nonnegative Matrix Factorization
2005
Cited by 44 (1 self)
In this study, we apply a nonnegative matrix factorization approach for the extraction and detection of concepts or topics from electronic mail messages. For the publicly released Enron electronic mail collection, we encode sparse term-by-message matrices and use a low-rank nonnegative matrix factorization algorithm to preserve natural data nonnegativity and avoid the subtractive basis vector and encoding interactions present in techniques such as principal component analysis. Results in topic detection and message clustering are discussed in the context of published Enron business practices and activities, and benchmarks addressing the computational complexity of our approach are provided. The resulting basis vectors and matrix projections can be used to identify and monitor underlying semantic features (topics) and message clusters in a general or high-level way, without the need to read individual electronic mail messages.
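The basic machinery here, factoring a nonnegative term-by-message matrix A into nonnegative factors W (term loadings per topic) and H (topic weights per message), can be sketched with the standard Lee-Seung multiplicative updates. The toy corpus below is synthetic; the paper's actual algorithm, data, and rank are different.

```python
import numpy as np

rng = np.random.default_rng(4)
# toy term-by-message matrix: 12 terms x 20 messages, two planted topics
terms, msgs, k = 12, 20, 2
W_true = np.zeros((terms, k))
W_true[:6, 0] = rng.random(6) + 0.5     # topic 1 uses terms 0-5
W_true[6:, 1] = rng.random(6) + 0.5     # topic 2 uses terms 6-11
H_true = rng.random((k, msgs))
A = W_true @ H_true                     # nonnegative "counts"

# Lee-Seung multiplicative updates for A ~ W H with W, H >= 0;
# the update rules never subtract, so nonnegativity is preserved
W = rng.random((terms, k)) + 0.1
H = rng.random((k, msgs)) + 0.1
eps = 1e-9
for _ in range(500):
    H *= (W.T @ A) / (W.T @ W @ H + eps)
    W *= (A @ H.T) / (W @ H @ H.T + eps)

err = np.linalg.norm(A - W @ H) / np.linalg.norm(A)
topics = H.argmax(axis=0)               # dominant topic per message
```

The columns of W play the role of the interpretable "basis vectors" the abstract describes: each is a nonnegative bundle of terms, with no subtractive interactions to cancel out.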
Bi-cross-validation of the SVD and the nonnegative matrix factorization
Cited by 31 (4 self)
This article presents a form of bi-cross-validation (BCV) for choosing the rank in outer product models, especially the singular value decomposition (SVD) and the nonnegative matrix factorization (NMF). Instead of leaving out a set of rows of the data matrix, we leave out a set of rows and a set of columns, and then predict the left-out entries by low-rank operations on the retained data. We prove a self-consistency result expressing the prediction error as a residual from a low-rank approximation. Random matrix theory and some empirical results suggest that smaller holdout sets lead to more overfitting, while larger ones are more prone to underfitting. In simulated examples we find that a method leaving out half the rows and half the columns performs well.
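The leave-out-a-block idea can be sketched directly: partition X into blocks A (held out), B, C, and D (retained), and predict A from B, C, and a rank-k truncation of D. A minimal numpy version under a synthetic rank-3 signal (the half-and-half split follows the abstract's recommendation; everything else is illustrative):

```python
import numpy as np

rng = np.random.default_rng(5)
# synthetic data: true rank 3 plus small noise
m, n, r = 40, 30, 3
X = rng.standard_normal((m, r)) @ rng.standard_normal((r, n)) \
    + 0.1 * rng.standard_normal((m, n))

def bcv_error(X, k, rows, cols):
    # hold out the block A = X[rows, cols]; retain B, C, D
    rmask = np.zeros(X.shape[0], bool); rmask[rows] = True
    cmask = np.zeros(X.shape[1], bool); cmask[cols] = True
    A = X[np.ix_(rmask, cmask)]
    B = X[np.ix_(rmask, ~cmask)]
    C = X[np.ix_(~rmask, cmask)]
    D = X[np.ix_(~rmask, ~cmask)]
    U, s, Vt = np.linalg.svd(D, full_matrices=False)
    # rank-k pseudoinverse of the retained block D
    Dk_pinv = Vt[:k].T @ np.diag(1 / s[:k]) @ U[:, :k].T
    # predict the held-out block by low-rank operations on retained data
    return np.linalg.norm(A - B @ Dk_pinv @ C)**2

rows, cols = np.arange(20), np.arange(15)  # leave out half the rows and columns
errs = {k: bcv_error(X, k, rows, cols) for k in range(1, 7)}
best = min(errs, key=errs.get)             # BCV-selected rank
```

In practice one averages this held-out error over several complementary row/column folds before picking the minimizing rank; the single split above is kept for brevity.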
Nonlinear Extraction of Independent Components of Natural Images Using Radial Gaussianization
2009
Cited by 30 (5 self)
We consider the problem of efficiently encoding a signal by transforming it to a new representation whose components are statistically independent. A widely studied linear solution, known as independent component analysis (ICA), exists for the case when the signal is generated as a linear transformation of independent non-Gaussian sources. Here, we examine a complementary case, in which the source is non-Gaussian and elliptically symmetric. In this case, no invertible linear transform suffices to decompose the signal into independent components, but we show that a simple nonlinear transformation, which we call radial gaussianization (RG), is able to remove all dependencies. We then examine this methodology in the context of natural image statistics. We first show that distributions of spatially proximal bandpass filter responses are better described as elliptical than as linearly transformed independent sources. Consistent with this, we demonstrate that the reduction in dependency achieved by applying RG to either nearby pairs or blocks of bandpass filter responses is significantly greater than that achieved by ICA. Finally, we show that the RG transformation may be closely approximated by divisive normalization, which has been used to model the nonlinear response properties of visual neurons.
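The RG idea, leave each point's direction alone but remap its radius so the radial distribution matches that of a standard Gaussian, can be demonstrated on a Gaussian scale mixture, whose components are uncorrelated yet dependent. This sketch estimates the target radial distribution by Monte Carlo rather than analytically, and uses a crude squared-coordinate correlation as a dependence proxy; both are simplifications of the paper's methodology.

```python
import numpy as np

rng = np.random.default_rng(6)
d, n = 2, 20000
# elliptical non-Gaussian data: a Gaussian scale mixture (multivariate Student-t).
# The coordinates are uncorrelated but NOT independent; no linear map fixes that.
g = rng.standard_normal((n, d))
mix = np.sqrt(rng.chisquare(10, size=(n, 1)) / 10)
X = g / mix

r = np.linalg.norm(X, axis=1)
# radial gaussianization: match the empirical radial distribution to the
# radial distribution of a standard Gaussian (estimated here by sampling)
target_r = np.sort(np.linalg.norm(rng.standard_normal((n, d)), axis=1))
ranks = np.argsort(np.argsort(r))
r_new = target_r[ranks]
Y = X * (r_new / r)[:, None]     # rescale radii, keep directions

def sq_corr(Z):
    # dependence proxy: correlation between squared coordinates
    return abs(np.corrcoef(Z[:, 0]**2, Z[:, 1]**2)[0, 1])

before, after = sq_corr(X), sq_corr(Y)
```

For the scale mixture, the shared mixing variable couples the coordinates' magnitudes, so `before` is clearly positive; after RG the data is approximately spherical Gaussian and the coupling disappears.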
Interday forecasting and intraday updating of call center arrivals
In Manufacturing & Service Operations Management, 2008
Cited by 27 (3 self)
Accurate forecasting of call arrivals is critical for staffing and scheduling of a telephone call center. We develop methods for interday and dynamic intraday forecasting of incoming call volumes. Our approach is to treat the intraday call volume profiles as a high-dimensional vector time series. We propose to first reduce the dimensionality by singular value decomposition of the matrix of historical intraday profiles, and then apply time series and regression techniques. Both interday (day-to-day) dynamics and intraday (within-day) patterns of call arrivals are taken into account by our approach. Distributional forecasts are also developed. The proposed methods are data-driven and appear to be robust against model assumptions in our simulation studies. They are shown to be very competitive in out-of-sample forecast comparisons using two real data sets. Our methods are computationally fast, and it is therefore feasible to use them for real-time dynamic forecasting.
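The SVD-then-forecast pipeline can be sketched as: stack historical days into a day-by-interval matrix, extract a few intraday basis curves via SVD, forecast each day's factor scores with a simple time-series model, and reconstruct the next day's profile. The synthetic arrival process and the AR(1) score model below are illustrative stand-ins for the paper's data and models.

```python
import numpy as np

rng = np.random.default_rng(7)
days, intervals, k = 200, 48, 2
# synthetic history: a smooth within-day shape whose daily amplitude
# follows an AR(1) day-to-day process, plus noise
t = np.linspace(0, 1, intervals)
shape = np.sin(np.pi * t)                      # within-day pattern
amp = np.empty(days)
amp[0] = 100.0
for i in range(1, days):
    amp[i] = 50 + 0.5 * amp[i - 1] + rng.standard_normal()
V = np.outer(amp, shape) + 0.5 * rng.standard_normal((days, intervals))

# step 1: reduce dimensionality with the SVD of historical intraday profiles
U, s, Wt = np.linalg.svd(V, full_matrices=False)
scores = U[:, :k] * s[:k]                      # daily factor scores
basis = Wt[:k]                                 # intraday basis curves

# step 2: forecast each factor score with a fitted AR(1), then reconstruct
forecast = np.zeros(intervals)
for j in range(k):
    y, x = scores[1:, j], scores[:-1, j]
    A = np.c_[np.ones(len(x)), x]
    b0, b1 = np.linalg.lstsq(A, y, rcond=None)[0]
    forecast += (b0 + b1 * scores[-1, j]) * basis[j]
```

Working in the low-dimensional score space is what makes the method fast enough for the real-time intraday updating the abstract mentions: only k small time-series models need refitting as new data arrives.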
A novel incremental principal component analysis and its application for face recognition
In IEEE Transactions on Systems, Man, and Cybernetics, Part B, 2006
Cited by 25 (0 self)
Principal component analysis (PCA) has been proven to be an efficient method in pattern recognition and image analysis. Recently, PCA has been extensively employed in face-recognition algorithms such as eigenface and fisherface, and encouraging results have been reported and discussed in the literature. Many PCA-based face-recognition systems have also been developed in the last decade. However, existing PCA-based face-recognition systems are hard to scale up because of their computational cost and memory requirements. To overcome this limitation, an incremental approach is usually adopted. Incremental PCA (IPCA) methods have been studied for many years in the machine-learning community. The major limitation of existing IPCA methods is that there is no guarantee on the approximation error. In view of this limitation, this paper proposes a new IPCA method based on the idea of a singular value decomposition (SVD) updating algorithm, namely an SVD-updating-based IPCA (SVDU-IPCA) algorithm. For the proposed SVDU-IPCA algorithm, we have mathematically proved that the approximation error is bounded. A complexity analysis of the proposed method is also presented. Another characteristic of the proposed SVDU-IPCA algorithm is that it can be easily extended to a kernel version. The proposed method has been evaluated using available public databases, namely FERET, AR, and Yale B, and applied to existing face-recognition algorithms. Experimental results show that the difference in average recognition accuracy between the proposed incremental method and the batch-mode method is less than 1%, which implies that the proposed SVDU-IPCA method gives a close approximation to the batch-mode PCA method. Index Terms: error analysis, face recognition, incremental principal component analysis (PCA), singular value decomposition (SVD).
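The SVD-updating idea behind this kind of incremental PCA can be sketched with a standard rank-k update: project the newly arriving columns onto the current left singular subspace, capture the residual with a QR factorization, and diagonalize a small core matrix instead of redecomposing all the data. This is a generic incremental-SVD sketch, not the paper's SVDU-IPCA algorithm or its error bound; the data and sizes are illustrative.

```python
import numpy as np

rng = np.random.default_rng(8)
d, n0, n1, k = 20, 30, 10, 5
# initial batch: approximately rank-k (centered) samples, plus tiny noise
X0 = rng.standard_normal((d, k)) @ rng.standard_normal((k, n0)) \
     + 0.001 * rng.standard_normal((d, n0))
B = rng.standard_normal((d, n1))       # newly arriving samples

# batch-mode SVD of the initial data, truncated to rank k
U, s, Vt = np.linalg.svd(X0, full_matrices=False)
U, s = U[:, :k], s[:k]

# incremental step: fold the new columns into the existing factorization
P = U.T @ B                            # part of B inside the current subspace
R = B - U @ P                          # residual outside it
Q, T = np.linalg.qr(R)                 # orthonormal basis for the residual
# small core matrix whose SVD updates the whole factorization
K = np.block([[np.diag(s), P],
              [np.zeros((Q.shape[1], k)), T]])
Uk, sk, _ = np.linalg.svd(K, full_matrices=False)
U_new = np.hstack([U, Q]) @ Uk[:, :k]  # updated leading left singular vectors
s_new = sk[:k]

# reference: recompute the SVD on all data from scratch (batch mode)
s_batch = np.linalg.svd(np.hstack([X0, B]), compute_uv=False)[:k]
```

The update only decomposes a (k + n1)-sized core matrix, which is what makes incremental schemes scale where repeated batch SVDs do not; the approximation error here comes solely from the initial rank-k truncation.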