Results 1–10 of 52
Locally defined principal curves and surfaces
Journal of Machine Learning Research, 2011
Abstract

Cited by 23 (0 self)
Principal curves are defined as self-consistent smooth curves passing through the middle of the data, and they have been used in many applications of machine learning as a generalization, dimensionality reduction, and feature extraction tool. We redefine principal curves and surfaces in terms of the gradient and the Hessian of the probability density estimate. This provides a geometric understanding of principal curves and surfaces, as well as a unifying view of clustering, principal curve fitting and manifold learning, by regarding these as principal manifolds of different intrinsic dimensionalities. The theory does not impose any particular density estimation method; it can be used with any density estimator that gives continuous first and second derivatives. Therefore, we first present our principal curve/surface definition without assuming any particular density estimation method. Afterwards, we develop practical algorithms for the commonly used kernel density estimation (KDE) and Gaussian mixture models (GMM). Results of these algorithms are presented on notional data sets as well as in real applications, with comparisons to other approaches in the principal curve literature. All in all, we present a novel theoretical understanding of principal curves and surfaces, practical algorithms as general-purpose machine learning tools, and applications of these algorithms to several practical problems.
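As a rough illustration of the gradient/Hessian definition above, a subspace-constrained mean-shift update against a Gaussian KDE can be sketched in a few lines. This is a simplified sketch, not the authors' reference implementation; the bandwidth handling and step rule are assumptions:

```python
import numpy as np

def scms_step(x, data, h):
    """One subspace-constrained mean-shift (SCMS) style step.

    Moves x along the mean-shift direction, but only within the span of
    the d-1 smallest-eigenvalue eigenvectors of the local KDE Hessian,
    so iterates settle on a one-dimensional ridge of the density.
    """
    diffs = data - x                                    # (n, d)
    w = np.exp(-np.sum(diffs**2, axis=1) / (2 * h**2))  # Gaussian kernel weights
    m = (w @ data) / w.sum() - x                        # mean-shift vector
    # Hessian of the (unnormalized) Gaussian KDE at x
    H = np.einsum('i,ij,ik->jk', w, diffs, diffs) / h**4 \
        - w.sum() * np.eye(x.size) / h**2
    vals, vecs = np.linalg.eigh(H)                      # ascending eigenvalues
    V = vecs[:, :-1]                                    # d-1 smallest directions
    return x + V @ (V.T @ m)                            # constrained move
```

Iterating this step from a point near the data pulls it onto the ridge while leaving motion along the curve unconstrained.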
Elastic Principal Graphs and Manifolds and their Practical Applications
Computing, 2005
Abstract

Cited by 17 (8 self)
Principal manifolds serve as a useful tool for many practical applications. These manifolds are defined as lines or surfaces passing through “the middle” of the data distribution. We propose an algorithm for fast construction of grid approximations of principal manifolds with given topology. It is based on an analogy between principal manifolds and elastic membranes. The first advantage of this method is the form of the functional to be minimized, which becomes quadratic at the vertex-position refinement step. This makes the algorithm very effective, especially for parallel implementations. Another advantage is that the same algorithmic kernel is applied to construct principal manifolds of different dimensions and topologies. We demonstrate how the flexibility of the approach allows numerous adaptive strategies, such as principal graph construction. The algorithm is implemented as a C++ package, elmap, and as part of the standalone data visualization tool VidaExpert, available on the web. We describe the approach and provide several examples of its application with speed performance characteristics.
PRINCIPAL MANIFOLDS AND GRAPHS IN PRACTICE: FROM MOLECULAR BIOLOGY TO DYNAMICAL SYSTEMS
Abstract

Cited by 10 (1 self)
We present several applications of nonlinear data modeling using principal manifolds and principal graphs constructed with the metaphor of elasticity (the elastic principal graph approach). These approaches are generalizations of Kohonen's self-organizing maps, a class of artificial neural networks. On several examples we show the advantages of using nonlinear objects for data approximation in comparison to linear ones. We propose four numerical criteria for comparing linear and nonlinear mappings of datasets into spaces of lower dimension. The examples are taken from comparative political science, from the analysis of high-throughput data in molecular biology, and from the analysis of dynamical systems.
Data Skeletonization via Reeb Graphs
Abstract

Cited by 8 (0 self)
Recovering hidden structure from complex and noisy nonlinear data is one of the most fundamental problems in machine learning and statistical inference. While such data is often high-dimensional, it is of interest to approximate it with a low-dimensional or even one-dimensional space, since many important aspects of data are often intrinsically low-dimensional. Furthermore, there are many scenarios where the underlying structure is graph-like, e.g., river/road networks or various trajectories. In this paper, we develop a framework to extract, as well as to simplify, a one-dimensional “skeleton” from unorganized data using the Reeb graph. Our algorithm is very simple, does not require complex optimizations, and can be easily applied to unorganized high-dimensional data such as point clouds or proximity graphs. It can also represent arbitrary graph structures in the data. We also give theoretical results to justify our method. We provide a number of experiments to demonstrate the effectiveness and generality of our algorithm, including comparisons to existing methods, such as principal curves. We believe that the simplicity and practicality of our algorithm will help to promote skeleton graphs as a data analysis tool for a broad range of applications.
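A toy version of the level-set construction behind a Reeb graph can be sketched on a proximity graph: bin vertices into slices of a scalar function, contract each within-slice connected component to a skeleton node, and link components joined by graph edges. This is an illustrative sketch, not the paper's algorithm; the slicing scheme and function choice are assumptions:

```python
from collections import defaultdict, deque

def reeb_skeleton(edges, f, n_slices=8):
    """Approximate Reeb-graph skeleton of a proximity graph.

    edges: iterable of (u, v) vertex pairs; f: dict vertex -> scalar
    function value (e.g. geodesic distance from a base point).
    Vertices are binned into level-set slices of f; each connected
    component within a slice becomes a skeleton node, and components
    joined by a graph edge become skeleton edges.
    """
    lo, hi = min(f.values()), max(f.values())
    width = (hi - lo) / n_slices or 1.0
    slice_of = {v: min(int((f[v] - lo) / width), n_slices - 1) for v in f}
    adj = defaultdict(list)
    for u, v in edges:
        adj[u].append(v); adj[v].append(u)
    comp, n_comp = {}, 0
    for v in f:                              # BFS per within-slice component
        if v in comp:
            continue
        comp[v] = n_comp
        q = deque([v])
        while q:
            u = q.popleft()
            for w in adj[u]:
                if w not in comp and slice_of[w] == slice_of[v]:
                    comp[w] = n_comp
                    q.append(w)
        n_comp += 1
    skel_edges = {(min(comp[u], comp[v]), max(comp[u], comp[v]))
                  for u, v in edges if comp[u] != comp[v]}
    return n_comp, sorted(skel_edges)
```

On a cycle graph with a height function, the resulting skeleton contains as many edges as nodes, i.e. the loop in the data survives contraction.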
Topological grammars for data approximation
Applied Mathematics Letters, 2007
Abstract

Cited by 7 (5 self)
A method of topological grammars is proposed for multidimensional data approximation. For data with complex topology we define a principal cubic complex of low dimension and given complexity that gives the best approximation for the dataset. This complex is a generalization of linear and nonlinear principal manifolds and includes them as particular cases. The problem of optimal principal complex construction is transformed into a series of minimization problems for quadratic functionals. These quadratic functionals have a physically transparent interpretation in terms of elastic energy. For the energy computation, the whole complex is represented as a system of nodes and springs. Topologically, the principal complex is a product of one-dimensional continua (represented by graphs), and the grammars describe how these continua transform during the process of optimal complex construction. This factorization of the whole process into one-dimensional transformations using minimization of quadratic energy functionals allows us to construct efficient algorithms.
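The nodes-and-springs energy mentioned above is quadratic in the node positions, which is what turns each refinement step into a linear solve. A minimal sketch of such an energy (variable names and the single-stiffness simplification are illustrative assumptions, not the paper's full functional):

```python
import numpy as np

def elastic_energy(y, data, idx, edges, lam):
    """Elastic energy of a node-and-spring system approximating data.

    y: (k, d) node positions; idx: nearest-node index per data point;
    edges: node-index pairs joined by springs; lam: spring stiffness.
    Both terms are quadratic in y, so minimizing over y for fixed idx
    is a linear problem.
    """
    misfit = ((data - y[idx]) ** 2).sum()          # data-approximation term
    i = [a for a, _ in edges]
    j = [b for _, b in edges]
    springs = lam * ((y[i] - y[j]) ** 2).sum()     # stretching term
    return misfit + springs
```

Minimizing this over `y` while re-assigning `idx`, and letting a grammar add or remove nodes and edges between solves, is the overall scheme the abstract describes.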
Principal graphs and manifolds
In Handbook of Research on Machine Learning Applications and Trends: Algorithms, Methods and Techniques, 2008
Abstract

Cited by 7 (3 self)
In many physical, statistical, biological and other investigations it is desirable to approximate a system of points by objects of lower dimension and/or complexity. For this purpose, Karl Pearson invented principal component analysis in 1901 and found ‘lines and planes of closest fit to systems of points’. The famous k-means algorithm solves the approximation problem too, but by finite sets instead of lines and planes. This chapter gives a brief practical introduction to the methods of construction of general principal objects, i.e. objects embedded in the ‘middle’ of a multidimensional data set. As a basis, the unifying framework of mean squared distance approximation of finite datasets is selected. Principal graphs and manifolds are constructed as generalisations of principal components and k-means principal points. For this purpose, the family of expectation/maximisation algorithms with nearest generalisations is presented. Construction of principal graphs with controlled complexity is based on the graph grammar approach.
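The k-means "principal points" construction mentioned above is the simplest member of this expectation/maximisation family; a minimal Lloyd-iteration sketch (initialization and iteration count are illustrative choices):

```python
import numpy as np

def kmeans(data, k, n_iter=50, seed=0):
    """Lloyd's k-means: approximate a dataset by k 'principal points'.

    Alternates the two steps of the expectation/maximisation scheme:
    (E) assign each point to its nearest centre,
    (M) move each centre to the mean of its assigned points.
    """
    rng = np.random.default_rng(seed)
    centres = data[rng.choice(len(data), k, replace=False)]
    for _ in range(n_iter):
        d2 = ((data[:, None] - centres[None]) ** 2).sum(-1)
        idx = d2.argmin(axis=1)
        for j in range(k):
            pts = data[idx == j]
            if len(pts):                      # keep empty centres in place
                centres[j] = pts.mean(axis=0)
    return centres, idx
```

Replacing the finite set of centres by nodes of a graph or grid, with added smoothness penalties, yields the principal graphs and manifolds the chapter constructs.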
Three-stage Handwriting Stroke Extraction Method with Hidden Loop Recovery
In 8th ICDAR, 2005, Seoul, Korea
Abstract

Cited by 6 (1 self)
A method for the extraction of strokes from handwritten characters and graphemes is presented. The method allows modelling of the original pen-tip trajectory close to that perceived by humans, thus allowing its use in writer identification and verification tasks. The method is also capable of identifying retraced strokes and recovering hidden loops. Strokes are represented as cubic splines. The method extracts strokes in three stages: vectorisation, merging of skeletal branches and loop recovery, and final adjustment of near-junction and loop pieces. The evaluation of the method is performed by using its results for structural feature extraction and writer classification based on those features.
Spectral dimensionality reduction, 2004
Abstract

Cited by 6 (0 self)
In this chapter, we study and put under a common framework a number of nonlinear dimensionality reduction methods, such as Locally Linear Embedding, Isomap, Laplacian eigenmaps and kernel PCA, which are based on performing an eigendecomposition (hence the name “spectral”). That framework also includes classical methods such as PCA and metric multidimensional scaling (MDS), as well as the data transformation step used in spectral clustering. We show that in all of these cases the learning algorithm estimates the principal eigenfunctions of an operator that depends on the unknown data density and on a kernel that is not necessarily positive semidefinite. This helps to generalize some of these algorithms so as to predict an embedding for out-of-sample examples without having to retrain the model. It also makes more transparent what these algorithms are minimizing on the empirical data and gives a corresponding notion of generalization error.
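The shared computational core the chapter describes, eigendecomposition of a centred kernel matrix, can be sketched as kernel PCA; with a linear kernel this reproduces classical MDS/PCA distances. A minimal sketch that ignores the out-of-sample extension:

```python
import numpy as np

def kernel_pca(K, n_components=2):
    """Embed points from a kernel (Gram) matrix via eigendecomposition.

    Double-centre the kernel matrix, take its top eigenvectors, and
    scale them by the square roots of the eigenvalues. Many spectral
    methods differ mainly in how K is built.
    """
    n = K.shape[0]
    H = np.eye(n) - np.ones((n, n)) / n     # centring matrix
    Kc = H @ K @ H
    vals, vecs = np.linalg.eigh(Kc)         # ascending order
    vals, vecs = vals[::-1], vecs[:, ::-1]  # descending order
    lam = np.clip(vals[:n_components], 0, None)
    return vecs[:, :n_components] * np.sqrt(lam)
```

Because a linear kernel K = X Xᵀ makes Kc the centred Gram matrix, the two-component embedding of 2-D data preserves pairwise distances exactly.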
Elastic maps and nets for approximating principal manifolds and their application to microarray data visualization
 In this book
Abstract

Cited by 5 (4 self)
Summary. Principal manifolds are defined as lines or surfaces passing through “the middle” of the data distribution. Linear principal manifolds (Principal Components Analysis) are routinely used for dimension reduction, noise filtering and data visualization. Recently, methods for constructing nonlinear principal manifolds were proposed, including our elastic maps approach, which is based on a physical analogy with elastic membranes. We have developed a general geometric framework for constructing “principal objects” of various dimensions and topologies with the simplest quadratic form of the smoothness penalty, which allows very effective parallel implementations. Our approach is implemented in three programming languages (C++, Java and Delphi) with two graphical user interfaces (the VidaExpert and ViMiDa applications). In this paper we overview the method of elastic maps and present in detail one of its major applications: the visualization of microarray data in bioinformatics. We show that the method of elastic maps outperforms linear PCA in terms of data approximation, representation of between-point distance structure, preservation of local point neighborhoods and representation of point classes in low-dimensional spaces. Key words: elastic maps, principal manifolds, elastic functional, data analysis, data visualization, surface modeling
Nonlinear Spherical Shells for Approximate Principal Curves Skeletonization
Abstract

Cited by 4 (3 self)
We present Nonlinear Spherical Shells (NSS), a non-iterative, model-free method for constructing approximate principal-curve skeletons in volumes of d-dimensional data points. NSS leverages existing model-free techniques for nonlinear dimensionality reduction to remove nonlinear artifacts in data. With nonlinearities removed and topology preserved, data embedded by such procedures are assumed to have properties amenable to simple skeletonization procedures. Given these assumptions, NSS is able to extract points in the “middle” of the volume data and hierarchically link them into principal curves, or a set of 1-manifolds connected at junctions.