Dimensionality Reduction Using Genetic Algorithms
IEEE Transactions on Evolutionary Computation

by Michael L. Raymer, William F. Punch, Anil K. Jain
http://garage.cps.msu.edu/papers/GARAGe00-07-01.pdf

Abstract:

Pattern recognition generally requires that objects be described in terms of a set of measurable features. The selection and quality of the features representing each pattern have a considerable bearing on the success of subsequent pattern classification. Feature extraction is the process of deriving new features from the original features in order to reduce the cost of feature measurement, increase classifier efficiency, and allow higher classification accuracy. Many current feature extraction techniques involve linear transformations of the original pattern vectors to new vectors of lower dimensionality. While this is useful for data visualization and for increasing classification efficiency, it does not necessarily reduce the number of features that must be measured, since each new feature may be a linear combination of all of the features in the original pattern vector. Here we present a new approach to feature extraction in which feature selection, feature extraction, and classifier training are performed simultaneously using a genetic algorithm. The genetic algorithm optimizes a vector of feature weights, which are used to scale the individual features in the original pattern vectors in either a linear or a nonlinear fashion. A masking vector is also employed to perform simultaneous selection of a subset of the features. We employ this technique in combination with the k-nearest-neighbor classification rule, and compare the results with classical feature selection and extraction techniques, including sequential floating forward feature selection and linear discriminant analysis. We also present results for identification of favorable water binding sites on protein surfaces, an important problem in biochemistry and drug design.
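
The idea in the abstract (a genetic algorithm that evolves a feature-weight vector together with a 0/1 masking vector, scored by a k-nearest-neighbor classifier) can be illustrated with a minimal Python sketch. This is not the authors' implementation. The chromosome layout, the weight range of 0 to 10, the leave-one-out fitness, the per-feature penalty, and every function name below are assumptions made for illustration, and only linear scaling is shown.

```python
import random

def knn_predict(train, query, weights, mask, k=3):
    """Majority vote of the k nearest neighbors under a weighted, masked distance."""
    def dist(a, b):
        # A masked-out feature (m == 0) contributes nothing to the distance.
        return sum(m * (w * (x - y)) ** 2
                   for x, y, w, m in zip(a, b, weights, mask))
    neighbors = sorted(train, key=lambda item: dist(item[0], query))[:k]
    votes = {}
    for _, label in neighbors:
        votes[label] = votes.get(label, 0) + 1
    return max(votes, key=votes.get)

def fitness(chrom, data, k=3, penalty=0.01):
    """Leave-one-out accuracy minus a small cost per selected feature (assumed form)."""
    weights, mask = chrom
    if not any(mask):
        return 0.0  # no features selected: nothing to classify with
    correct = 0
    for i, (x, y) in enumerate(data):
        rest = data[:i] + data[i + 1:]
        if knn_predict(rest, x, weights, mask, k) == y:
            correct += 1
    return correct / len(data) - penalty * sum(mask)

def random_chrom(n, rng):
    """A chromosome is a pair: (real-valued weights, 0/1 mask)."""
    return ([rng.uniform(0.0, 10.0) for _ in range(n)],
            [rng.randint(0, 1) for _ in range(n)])

def crossover(a, b, rng):
    """Uniform crossover applied gene by gene to weights and mask."""
    n = len(a[0])
    pick = [rng.random() < 0.5 for _ in range(n)]
    return ([a[0][i] if pick[i] else b[0][i] for i in range(n)],
            [a[1][i] if pick[i] else b[1][i] for i in range(n)])

def mutate(chrom, rng, rate=0.1):
    """Perturb weights with Gaussian noise (clamped to [0, 10]) and flip mask bits."""
    weights, mask = chrom
    new_w = [min(10.0, max(0.0, w + rng.gauss(0, 1))) if rng.random() < rate else w
             for w in weights]
    new_m = [1 - m if rng.random() < rate else m for m in mask]
    return (new_w, new_m)

def evolve(data, n_features, generations=20, pop_size=20, seed=0):
    """Elitist GA: keep the better half, refill with mutated offspring."""
    rng = random.Random(seed)
    pop = [random_chrom(n_features, rng) for _ in range(pop_size)]
    for _ in range(generations):
        ranked = sorted(pop, key=lambda c: fitness(c, data), reverse=True)
        elite = ranked[:pop_size // 2]
        children = [mutate(crossover(rng.choice(elite), rng.choice(elite), rng), rng)
                    for _ in range(pop_size - len(elite))]
        pop = elite + children
    return max(pop, key=lambda c: fitness(c, data))

if __name__ == "__main__":
    # Toy data: feature 0 separates the classes, feature 1 is uninformative.
    toy = [([i * 0.1, (i * 7) % 5], "a" if i < 5 else "b") for i in range(10)]
    weights, mask = evolve(toy, n_features=2, generations=10, pop_size=10)
    print("weights:", weights, "mask:", mask)
```

Because the mask and the weights sit in the same chromosome, a single fitness evaluation scores feature selection and feature scaling together, which is the coupling the abstract describes. The penalty term is one simple way to favor smaller feature subsets; other formulations are possible.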
