Results 1–10 of 29
A unified model for probabilistic principal surfaces
IEEE Transactions on Pattern Analysis and Machine Intelligence, 2001
Cited by 61 (6 self)
Abstract – Principal curves and surfaces are nonlinear generalizations of principal components and subspaces, respectively. They can provide insightful summary of high-dimensional data not typically attainable by classical linear methods. Solutions to several problems, such as proof of existence and convergence, faced by the original principal curve formulation have been proposed in the past few years. Nevertheless, these solutions are not generally extensible to principal surfaces, the mere computation of which presents a formidable obstacle. Consequently, relatively few studies of principal surfaces are available. Recently, we proposed the probabilistic principal surface (PPS) to address a number of issues associated with current principal surface algorithms. PPS uses a manifold-oriented covariance noise model, based on the generative topographical mapping (GTM), which can be viewed as a parametric formulation of Kohonen's self-organizing map. Building on the PPS, we introduce a unified covariance model that implements the PPS (0 < α < 1), the GTM (α = 1), and the manifold-aligned GTM (α > 1) by varying a clamping parameter α. Then, we comprehensively evaluate the empirical performance (reconstruction error) of the PPS, the GTM, and the manifold-aligned GTM on three popular benchmark data sets. It is shown in two different comparisons that the PPS outperforms the GTM under identical parameter settings. Convergence of the PPS is found to be identical to that of the GTM, and the computational overhead incurred by the PPS decreases to 40 percent or less for more complex manifolds. These results show that the generalized PPS provides a flexible and effective way of obtaining principal surfaces. Index Terms – Principal curve, principal surface, probabilistic, dimensionality reduction, nonlinear manifold, generative topographic mapping.
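The abstract presents principal surfaces as nonlinear generalizations of principal subspaces. As a point of reference for the linear special case, the first principal component (the "principal line") falls out of the top eigenvector of the sample covariance; the NumPy sketch below is a generic illustration of that baseline, not part of the PPS/GTM implementation.

```python
import numpy as np

def first_principal_component(X):
    """Return the mean and the unit direction of maximal variance.

    X: (n, d) data matrix. The principal *line* {mean + t*v} is the
    linear special case that principal curves and surfaces generalize.
    """
    mu = X.mean(axis=0)
    C = np.cov(X - mu, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(C)  # eigenvalues in ascending order
    v = eigvecs[:, -1]                    # direction of largest variance
    return mu, v

# Toy data scattered around the line y = 2x
rng = np.random.default_rng(0)
t = rng.normal(size=200)
X = np.column_stack([t, 2 * t]) + 0.05 * rng.normal(size=(200, 2))
mu, v = first_principal_component(X)
```

For this toy cloud, `v` aligns (up to sign) with the generating direction (1, 2)/√5.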
Principal Curves With Bounded Turn
2002
Cited by 16 (0 self)
Abstract – Principal curves, like principal components, are a tool used in multivariate analysis for ends like feature extraction. Defined in their original form, principal curves need not exist for general distributions. The existence of principal curves with bounded length for any distribution that satisfies some minimal regularity conditions has been shown. We define principal curves with bounded turn, show that they exist, and present a learning algorithm for them. Principal components are a special case of such curves when the turn is zero.
Local linear regression with adaptive orthogonal fitting for the wind power application
Statistics and Computing, 2008
Cited by 8 (2 self)
Abstract – Short-term forecasting of wind generation requires a model of the function for the conversion of meteorological variables (mainly wind speed) to power production. Such a power curve is nonlinear and bounded, in addition to being nonstationary. Local linear regression is an appealing nonparametric approach for power curve estimation, for which the model coefficients can be tracked with recursive Least Squares (LS) methods. This may lead to an inaccurate estimate of the true power curve, owing to the assumption that a noise component is present on the response variable axis only. Therefore, this assumption is relaxed here, by describing a local linear regression with orthogonal fit. Local linear coefficients are defined as those which minimize a weighted Total Least Squares (TLS) criterion. An adaptive estimation method is introduced in order to accommodate nonstationarity. This has the additional benefit of lowering the computational costs of updating local coefficients every time new observations become available. The estimation method is based on tracking the leftmost eigenvector of the augmented covariance matrix. A robustification of the estimation method is also proposed. Simulations on semi-artificial datasets (for which the true power curve is available) underline the properties of the proposed regression and related estimation methods. An important result is the significantly higher ability of local polynomial …
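The adaptive, weighted scheme described above is specific to the paper, but its core building block, an orthogonal (TLS) line fit obtained from the eigenvector of the covariance matrix with the smallest eigenvalue, can be sketched in batch form. The function name and the toy data below are illustrative assumptions, not the authors' code.

```python
import numpy as np

def tls_line_fit(x, y):
    """Orthogonal (Total Least Squares) fit of the line y = a + b*x.

    The line's normal vector is the eigenvector of the 2x2 covariance
    of (x, y) associated with the smallest eigenvalue -- the batch
    analogue of tracking the leftmost eigenvector of the augmented
    covariance matrix, as described in the abstract.
    """
    X = np.column_stack([x, y])
    mu = X.mean(axis=0)
    C = np.cov(X, rowvar=False)
    _, eigvecs = np.linalg.eigh(C)    # columns sorted by ascending eigenvalue
    n = eigvecs[:, 0]                 # leftmost eigenvector = line normal
    b = -n[0] / n[1]                  # slope (assumes a non-vertical line)
    a = mu[1] - b * mu[0]             # the TLS line passes through the centroid
    return a, b

# Toy data with noise on *both* axes, which is what motivates TLS over LS
rng = np.random.default_rng(1)
x_true = rng.uniform(0.0, 10.0, 300)
x = x_true + rng.normal(scale=0.1, size=300)
y = 2.0 + 1.5 * x_true + rng.normal(scale=0.1, size=300)
a, b = tls_line_fit(x, y)
```

Unlike ordinary LS, this fit minimizes perpendicular distances, so noise on the predictor axis does not bias the slope toward zero.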
Principal graphs and manifolds
In “Handbook of Research on Machine Learning Applications and Trends: Algorithms, Methods and Techniques”, 2008
Cited by 7 (3 self)
Abstract – In many physical, statistical, biological and other investigations it is desirable to approximate a system of points by objects of lower dimension and/or complexity. For this purpose, Karl Pearson invented principal component analysis in 1901 and found ‘lines and planes of closest fit to systems of points’. The famous k-means algorithm solves the approximation problem too, but by finite sets instead of lines and planes. This chapter gives a brief practical introduction into the methods of construction of general principal objects, i.e. objects embedded in the ‘middle’ of the multidimensional data set. As a basis, the unifying framework of mean squared distance approximation of finite datasets is selected. Principal graphs and manifolds are constructed as generalisations of principal components and k-means principal points. For this purpose, the family of expectation/maximisation algorithms with nearest generalisations is presented. Construction of principal graphs with controlled complexity is based on the graph grammar approach.
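As a concrete anchor for the "finite sets instead of lines and planes" remark, Lloyd's k-means iteration (a basic member of the expectation/maximisation family the chapter builds on) can be sketched as follows; the farthest-point seeding is an illustrative choice, not something prescribed by the chapter.

```python
import numpy as np

def kmeans_principal_points(X, k, iters=50):
    """Approximate a dataset by k 'principal points' via Lloyd's algorithm.

    Alternates the two steps of the expectation/maximisation family:
    assign each point to its nearest centre, then move each centre to
    the mean of its assigned points (minimising mean squared distance).
    """
    # Greedy farthest-point seeding (one illustrative choice of init)
    centres = [X[0]]
    for _ in range(k - 1):
        d = np.min([np.linalg.norm(X - c, axis=1) for c in centres], axis=0)
        centres.append(X[d.argmax()])
    centres = np.array(centres, dtype=float)

    for _ in range(iters):
        dists = np.linalg.norm(X[:, None, :] - centres[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        for j in range(k):
            members = X[labels == j]
            if len(members):
                centres[j] = members.mean(axis=0)
    return centres, labels

# Two tight, well-separated blobs: the principal points land on the blob means
rng = np.random.default_rng(2)
X = np.vstack([rng.normal(0.0, 0.1, (100, 2)), rng.normal(5.0, 0.1, (100, 2))])
centres, labels = kmeans_principal_points(X, k=2)
```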
Parameter Selection for Principal Curves
2011
Cited by 6 (1 self)
Abstract – Principal curves are nonlinear generalizations of the notion of first principal component. Roughly, a principal curve is a parameterized curve in R^d which passes through the “middle” of a data cloud drawn from some unknown probability distribution. Depending on the definition, a principal curve relies on some unknown parameters (number of segments, length, turn, ...) which have to be properly chosen to recover the shape of the data without interpolating. In the present paper, we consider the principal curve problem from an empirical risk minimization perspective and address the parameter selection issue using the point of view of model selection via penalization. We offer oracle inequalities and implement the proposed approaches to recover the hidden structures in both simulated and real-life data. Index terms – Principal curves, parameter selection, model selection, oracle inequality, penalty calibration, slope heuristics. 2010 Mathematics Subject Classification: 62G08, 62G05.
Principal Manifold Learning by Sparse Grids
2008
Cited by 4 (2 self)
Abstract – In this paper we deal with the construction of lower-dimensional manifolds from high-dimensional data, which is an important task in data mining, machine learning and statistics. Here, we consider principal manifolds as the minimum of a regularized, nonlinear empirical quantization error functional. For the discretization we use a sparse grid method in latent parameter space. This approach avoids, to some extent, the curse of dimension of conventional grids like in the GTM approach. The arising nonlinear problem is solved by a descent method which resembles the expectation maximization algorithm. We present our sparse grid principal manifold approach, discuss its properties and report on the results of numerical experiments for one-, two- and three-dimensional model problems.
Auto-Associative Models, Nonlinear Principal Component Analysis, Manifolds and Projection Pursuit
Cited by 2 (0 self)
Abstract – Auto-associative models have been introduced as a new tool for building nonlinear principal component analysis (PCA) methods. Such models rely on successive approximations of a dataset by manifolds of increasing dimensions. In this chapter, we propose a precise theoretical comparison between PCA and auto-associative models. We also highlight the links between auto-associative models, projection pursuit algorithms, and some neural network approaches. Numerical results are presented on simulated and real datasets.
Developments and Applications of Nonlinear Principal Component Analysis – a Review
Cited by 2 (0 self)
Abstract – Although linear principal component analysis (PCA) originates from the work of Sylvester [67] and Pearson [51], the development of nonlinear counterparts has only received attention from the 1980s. Work on nonlinear PCA, or NLPCA, can be divided into the utilization of auto-associative neural networks, principal curves and manifolds, kernel approaches, or the combination of these approaches. This article reviews existing algorithmic work, shows how a given data set can be examined to determine whether a conceptually more demanding NLPCA model is required, and lists developments of NLPCA algorithms. Finally, the paper outlines problem areas and challenges that require future work to mature the NLPCA research field.
Probabilistic Auto-Associative Models and Semi-Linear PCA
2012
Cited by 1 (0 self)
Abstract – Auto-associative models cover a large class of methods used in data analysis. In this paper, we describe the general properties of these models when the projection component is linear, and we propose and test an easy-to-implement probabilistic semi-linear auto-associative model in a Gaussian setting. We show that it is a generalization of the PCA model to the semi-linear case. Numerical experiments on simulated datasets and a real astronomical application highlight the interest of this approach.
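For context, the fully linear Gaussian baseline that this semi-linear model generalizes, probabilistic PCA, admits a closed-form maximum-likelihood solution (Tipping and Bishop). The sketch below shows only that baseline; it is not the authors' semi-linear model, and the toy data are an assumption.

```python
import numpy as np

def ppca_ml(X, q):
    """Closed-form ML estimates for probabilistic PCA.

    Model: x = W z + mu + eps, with z ~ N(0, I_q) and eps ~ N(0, sigma2 * I).
    The ML loadings come from the top-q eigenpairs of the sample
    covariance; sigma2 is the mean of the discarded eigenvalues.
    """
    mu = X.mean(axis=0)
    C = np.cov(X - mu, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(C)
    order = eigvals.argsort()[::-1]           # sort eigenpairs descending
    lam, U = eigvals[order], eigvecs[:, order]
    sigma2 = lam[q:].mean()                   # noise = average discarded variance
    W = U[:, :q] * np.sqrt(np.maximum(lam[:q] - sigma2, 0.0))
    return mu, W, sigma2

# One latent dimension embedded in R^3 with isotropic noise
rng = np.random.default_rng(3)
Z = rng.normal(size=(2000, 1))
X = Z @ np.array([[3.0, 0.0, 0.0]]) + 0.1 * rng.normal(size=(2000, 3))
mu, W, sigma2 = ppca_ml(X, q=1)
```

On this toy data the recovered noise variance is close to 0.01 and the loading column has norm close to the generating scale 3.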