Results 1  10
of
190
Statistical pattern recognition: A review
 IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE
, 2000
"... The primary goal of pattern recognition is supervised or unsupervised classification. Among the various frameworks in which pattern recognition has been traditionally formulated, the statistical approach has been most intensively studied and used in practice. More recently, neural network techniques ..."
Abstract

Cited by 1035 (30 self)
 Add to MetaCart
The primary goal of pattern recognition is supervised or unsupervised classification. Among the various frameworks in which pattern recognition has been traditionally formulated, the statistical approach has been most intensively studied and used in practice. More recently, neural network techniques and methods imported from statistical learning theory have bean receiving increasing attention. The design of a recognition system requires careful attention to the following issues: definition of pattern classes, sensing environment, pattern representation, feature extraction and selection, cluster analysis, classifier design and learning, selection of training and test samples, and performance evaluation. In spite of almost 50 years of research and development in this field, the general problem of recognizing complex patterns with arbitrary orientation, location, and scale remains unsolved. New and emerging applications, such as data mining, web searching, retrieval of multimedia data, face recognition, and cursive handwriting recognition, require robust and efficient pattern recognition techniques. The objective of this review paper is to summarize and compare some of the wellknown methods used in various stages of a pattern recognition system and identify research topics and applications which are at the forefront of this exciting and challenging field.
Curvilinear Component Analysis: A SelfOrganizing Neural Network for Nonlinear Mapping of Data Sets
, 1997
"... We present a new strategy called “curvilinear component analysis” (CCA) for dimensionality reduction and representation of multidimensional data sets. The principle of CCA is a selforganized neural network performing two tasks: vector quantization (VQ) of the submanifold in the data set (input spac ..."
Abstract

Cited by 212 (1 self)
 Add to MetaCart
(Show Context)
We present a new strategy called “curvilinear component analysis” (CCA) for dimensionality reduction and representation of multidimensional data sets. The principle of CCA is a selforganized neural network performing two tasks: vector quantization (VQ) of the submanifold in the data set (input space) and nonlinear projection (P) of these quantizing vectors toward an output space, providing a revealing unfolding of the submanifold. After learning, the network has the ability to continuously map any new point from one space into another: forward mapping of new points in the input space, or backward mapping of an arbitrary position in the output space.
Dimensionality Reduction Using Genetic Algorithms
, 2000
"... Pattern recognition generally requires that objects be described in terms of a set of measurable features. The selection and quality of the features representing each pattern has a considerable bearing on the success of subsequent pattern classification. Feature extraction is the process of deriving ..."
Abstract

Cited by 140 (11 self)
 Add to MetaCart
Pattern recognition generally requires that objects be described in terms of a set of measurable features. The selection and quality of the features representing each pattern has a considerable bearing on the success of subsequent pattern classification. Feature extraction is the process of deriving new features from the original features in order to reduce the cost of feature measurement, increase classifier efficiency, and allow higher classification accuracy. Many current feature extraction techniques involve linear transformations of the original pattern vectors to new vectors of lower dimensionality. While this is useful for data visualization and increasing classification efficiency, it does not necessarily reduce the number of features that must be measured, since each new feature may be a linear combination of all of the features in the original pattern vector. Here we present a new approach to feature extraction in which feature selection, feature extraction, and classifier training are performed simultaneously using a genetic algorithm. The genetic algorithm optimizes a vector of feature weights, which are used to scale the individual features in the original pattern vectors in either a linear or a nonlinear fashion. A masking vector is also employed to perform simultaneous selection of a subset of the features. We employ this technique in combination with the knearestneighbor classification rule, and compare the results with classical feature selection and extraction techniques, including sequential floating forward feature selection, and linear discriminant analysis. We also present results for identification of favorable water binding sites on protein surfaces, an important problem in biochemistry and drug design.
SOMBased Data Visualization Methods
 Intelligent Data Analysis
, 1999
"... The SelfOrganizing Map (SOM) is an efficient tool for visualization of multidimensional numerical data. In this paper, an overview and categorization of both old and new methods for the visualization of SOM is presented. The purpose is to give an idea of what kind of information can be acquired fro ..."
Abstract

Cited by 124 (4 self)
 Add to MetaCart
(Show Context)
The SelfOrganizing Map (SOM) is an efficient tool for visualization of multidimensional numerical data. In this paper, an overview and categorization of both old and new methods for the visualization of SOM is presented. The purpose is to give an idea of what kind of information can be acquired from different presentations and how the SOM can best be utilized in exploratory data visualization. Most of the presented methods can also be applied in the more general case of first making a vector quantization (e.g. kmeans) and then a vector projection (e.g. Sammon's mapping).
Data Exploration Using SelfOrganizing Maps
 ACTA POLYTECHNICA SCANDINAVICA: MATHEMATICS, COMPUTING AND MANAGEMENT IN ENGINEERING SERIES NO. 82
, 1997
"... Finding structures in vast multidimensional data sets, be they measurement data, statistics, or textual documents, is difficult and timeconsuming. Interesting, novel relations between the data items may be hidden in the data. The selforganizing map (SOM) algorithm of Kohonen can be used to aid the ..."
Abstract

Cited by 115 (4 self)
 Add to MetaCart
Finding structures in vast multidimensional data sets, be they measurement data, statistics, or textual documents, is difficult and timeconsuming. Interesting, novel relations between the data items may be hidden in the data. The selforganizing map (SOM) algorithm of Kohonen can be used to aid the exploration: the structures in the data sets can be illustrated on special map displays. In this work, the methodology of using SOMs for exploratory data analysis or data mining is reviewed and developed further. The properties of the maps are compared with the properties of related methods intended for visualizing highdimensional multivariate data sets. In a set of case studies the SOM algorithm is applied to analyzing electroencephalograms, to illustrating structures of the standard of living in the world, and to organizing fulltext document collections. Measures are proposed for evaluating the quality of different types of maps in representing a given data set, and for measuring the robu...
Face recognition with radial basis function (RBF) neural networks
 IEEE Transactions on Neural Networks
, 2002
"... Abstract—A general and efficient design approach using a radial basis function (RBF) neural classifier to cope with small training sets of high dimension, which is a problem frequently encountered in face recognition, is presented in this paper. In order to avoid overfitting and reduce the computati ..."
Abstract

Cited by 51 (2 self)
 Add to MetaCart
Abstract—A general and efficient design approach using a radial basis function (RBF) neural classifier to cope with small training sets of high dimension, which is a problem frequently encountered in face recognition, is presented in this paper. In order to avoid overfitting and reduce the computational burden, face features are first extracted by the principal component analysis (PCA) method. Then, the resulting features are further processed by the Fisher’s linear discriminant (FLD) technique to acquire lowerdimensional discriminant patterns. A novel paradigm is proposed whereby data information is encapsulated in determining the structure and initial parameters of the RBF neural classifier before learning takes place. A hybrid learning algorithm is used to train the RBF neural networks so that the dimension of the search space is drastically reduced in the gradient paradigm. Simulation results conducted on the ORL database show that the system achieves excellent performance both in terms of error rates of classification and learning efficiency. Index Terms—Face recognition, Fisher’s linear discriminant, ORL database, principal component analysis, radial basis function (RBF) neural networks, small training sets of high dimension. I.
FeedForward Neural Networks and Topographic Mappings for Exploratory Data Analysis
 Neural Computing and Applications
, 1996
"... A recent novel approach to the visualisation and analysis of datasets, and one which is particularly applicable to those of a high dimension, is discussed in the context of real applications. A feedforward neural network is utilised to effect a topographic, structurepreserving, dimensionreducing ..."
Abstract

Cited by 49 (2 self)
 Add to MetaCart
A recent novel approach to the visualisation and analysis of datasets, and one which is particularly applicable to those of a high dimension, is discussed in the context of real applications. A feedforward neural network is utilised to effect a topographic, structurepreserving, dimensionreducing transformation of the data, with an additional facility to incorporate different degrees of associated subjective information. The properties of this transformation are illustrated on synthetic and real datasets, including the 1992 UK Research Assessment Exercise for funding in higher education. The method is compared and contrasted to established techniques for feature extraction, and related to topographic mappings, the Sammon projection and the statistical field of multidimensional scaling. 1 INTRODUCTION The visualisation and analysis of highdimensional data is a difficult problem and one that may be helpfully viewed in the context of feature extraction, which provides a useful commo...
RelationshipBased Clustering and Visualization for HighDimensional Data Mining
 INFORMS Journal on Computing
, 2002
"... In several reallife datamining... This paper proposes a relationshipbased approach that alleviates both problems, sidestepping the "curseofdimensionality" issue by working in a suitable similarity space instead of the original highdimensional attribute space. This intermediary simil ..."
Abstract

Cited by 44 (10 self)
 Add to MetaCart
(Show Context)
In several reallife datamining... This paper proposes a relationshipbased approach that alleviates both problems, sidestepping the "curseofdimensionality" issue by working in a suitable similarity space instead of the original highdimensional attribute space. This intermediary similarity space can be suitably tailored to satisfy business criteria such as requiring customer clusters to represent comparable amounts of revenue. We apply efficient and scalable graphpartitioningbased clustering techniques in this space. The output from the clustering algorithm is used to reorder the data points so that the resulting permuted similarity matrix can be readily visualized in two dimensions, with clusters showing up as bands. While twodimensional visualization of a similarity matrix is by itself not novel, its combination with the ordersensitive partitioning of a graph that captures the relevant similarity measure between objects provides three powerful properties: (i) the highdimensionality of the data does not affect further processing once the similarity space is formed; (ii) it leads to clusters of (approximately) equal importance, and (iii) related clusters show up adjacent to one another, further facilitating the visualization of results. The visualization is very helpful for assessing and improving clustering. For example, actionable recommendations for splitting or merging of clusters can be easily derived, and it also guides the user toward the right number of clusters
ViSOM a novel method for multivariate data projection and structure visualization
 IEEE Transactions on Neural Networks
, 2002
"... Abstract—When used for visualization of highdimensional data, the selforganizing map (SOM) requires a coloring scheme such as the Umatrix to mark the distances between neurons. Even so, the structures of the data clusters may not be apparent and their shapes are often distorted. In this paper, a ..."
Abstract

Cited by 44 (11 self)
 Add to MetaCart
(Show Context)
Abstract—When used for visualization of highdimensional data, the selforganizing map (SOM) requires a coloring scheme such as the Umatrix to mark the distances between neurons. Even so, the structures of the data clusters may not be apparent and their shapes are often distorted. In this paper, a visualizationinduced SOM (ViSOM) is proposed to overcome these shortcomings. The algorithm constrains and regularizes the interneuron distance with a parameter that controls the resolution of the map. The mapping preserves the interpoint distances of the input data on the map as well as the topology. It produces a graded mesh in the data space such that the distances between mapped data points on the map resemble those in the original space, like in the Sammon mapping. However, unlike the Sammon mapping, the ViSOM can accommodate both training data and new arrivals and is much simpler in computational complexity. Several experimental results and comparisons with other methods are presented. Index Terms—Dimension reduction, multidimensional scaling, multivariate data visualization, nonlinear mapping, selforganizing maps (SOMs). I.