A Methodology for Feature Selection Using Multi-Objective Genetic Algorithms for Handwritten Digit String Recognition (2003)

by L S Oliveira, R Sabourin, F Bortolozzi, C Y Suen
Venue: IJPRAI
Results 1 - 10 of 34

Optimizing nearest neighbour in random subspaces using a multi-objective genetic algorithm

by Guillaume Tremblay, Robert Sabourin, Patrick Maupin - in 17th International Conference on Pattern Recognition (ICPR 2004), 2004
"... Abstract In this work, the authors have evaluated almost 20 millions ensembles of classifiers generated by several methods. Trying to optimize those ensembles based on the nearest neighbours and the random subspaces paradigms, we found that the use of a diversity metric called "ambiguity" ..."
Cited by 17 (9 self)
Abstract: In this work, the authors have evaluated almost 20 million ensembles of classifiers generated by several methods. In trying to optimize those ensembles, based on the nearest-neighbour and random subspace paradigms, we found that the use of a diversity metric called "ambiguity" had no better positive impact than plain stochastic search.

Citation Context

...es have an inherent diversity, would there be any benefit in using the measure of ambiguity as an optimization criterion in order to maximize performance? Can we significantly reduce the number of kNN in the ensemble, and thus reduce the cardinality of the EoC without any loss of performance in generalization? 2. Optimization methods In this work, we analyzed the effect of using a measure of diversity called ambiguity [4] in the optimization of ensembles of kNN generated by the random subspace method. The optimization is done using a multi-objective genetic algorithm and the NIST SD19 database [1]. The data consists of handwritten numerals forming a 10-class pattern recognition problem. Prior to the optimization, we have experimentally determined that a one-nearest neighbour classifier (k=1) in a subspace of 32 features (out of 132) was the best combination of parameters for our ensembles of kNN. [Figure 1: Accuracy evolution as the training set increases for MLP, kNN and ensemble of 100 kNN in rando...; axes: number of prototypes vs. recognition rate (86%-98%).]
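To make the setup in this excerpt concrete, below is a minimal sketch of a random-subspace ensemble of 1-NN classifiers (32 features drawn from 132, majority vote over 100 members, as described above). It is illustrative only, not the authors' code; the NumPy implementation, data shapes, and voting scheme are assumptions.

```python
import numpy as np

def random_subspace_1nn_ensemble(X_train, y_train, X_test,
                                 n_members=100, subspace_size=32, seed=0):
    """Majority vote of 1-NN classifiers, each built on a random feature subspace.

    Illustrative sketch of the random subspace method with k=1 (32 of 132
    features, 100 members), as described in the citation context above.
    Labels are assumed to be integer class ids (e.g. digits 0-9).
    """
    rng = np.random.default_rng(seed)
    n_features = X_train.shape[1]
    votes = np.zeros((X_test.shape[0], n_members), dtype=int)

    for m in range(n_members):
        # Each ensemble member sees only a random subset of the features.
        subspace = rng.choice(n_features, size=subspace_size, replace=False)
        Xtr, Xte = X_train[:, subspace], X_test[:, subspace]
        # Plain one-nearest-neighbour: label of the closest training sample.
        dists = ((Xte[:, None, :] - Xtr[None, :, :]) ** 2).sum(axis=2)
        votes[:, m] = y_train[dists.argmin(axis=1)]

    # Majority vote over the ensemble members.
    return np.array([np.bincount(row).argmax() for row in votes])
```

A multi-objective GA as studied in the paper would then search over subsets of these members, trading ensemble cardinality against recognition rate on an optimization set.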

Multiobjective genetic algorithms to create ensemble of classifiers

by Luiz S. Oliveira, Marisa Morita, Robert Sabourin, Flávio Bortolozzi - In Proceedings of the Third International Conference on Evolutionary Multi-Criterion Optimization, 2005
"... Abstract. Feature selection for ensembles has shown to be an effective strategy for ensemble creation due to its ability of producing good subsets of features, which make the classifiers of the ensemble disagree on difficult cases. In this paper we present an ensemble feature selection approach base ..."
Cited by 16 (0 self)
Abstract. Feature selection for ensembles has been shown to be an effective strategy for ensemble creation due to its ability to produce good subsets of features, which make the classifiers of the ensemble disagree on difficult cases. In this paper we present an ensemble feature selection approach based on a hierarchical multi-objective genetic algorithm. The algorithm operates at two levels. First, it performs feature selection in order to generate a set of classifiers; it then chooses the best team of classifiers. In order to show its robustness, the method is evaluated in two different contexts: supervised and unsupervised feature selection. In the former we consider the problem of handwritten digit recognition, while in the latter we consider the problem of handwritten month word recognition. Experiments and comparisons with classical methods, such as Bagging and Boosting, demonstrated that the proposed methodology brings compelling improvements when classifiers have to work with very low error rates.
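A rough sketch of the two-level "overproduce and choose" structure this abstract describes follows: level one trains one classifier per candidate feature subset, level two picks the best team. The callables `train_classifier` and `evaluate_team`, and the exhaustive team search standing in for the hierarchical multi-objective GA, are placeholders rather than the authors' method.

```python
import itertools

def overproduce_and_choose(train_classifier, evaluate_team,
                           candidate_feature_subsets, team_size=3):
    """Two-level ensemble feature selection sketch.

    Level 1 (overproduce): train one classifier per candidate feature subset.
    Level 2 (choose): select the team of classifiers with the best score.

    `train_classifier(subset)` and `evaluate_team(classifiers)` are
    hypothetical user-supplied callables; the exhaustive search below is a
    stand-in for the paper's hierarchical multi-objective GA.
    """
    # Level 1: generate a pool of classifiers, one per feature subset.
    pool = [train_classifier(subset) for subset in candidate_feature_subsets]

    # Level 2: choose the best-performing team from the pool.
    best_team, best_score = None, float("-inf")
    for team in itertools.combinations(pool, team_size):
        score = evaluate_team(list(team))
        if score > best_score:
            best_team, best_score = list(team), score
    return best_team, best_score
```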

Citation Context

...ing and sensitivity towards the weights. It has been demonstrated that feature selection through multi-objective genetic algorithm (MOGA) is a very powerful tool for finding a set of good classifiers [4, 14], since GA is quite effective in rapid global search of large, non-linear and poorly understood spaces [17]. Besides, it can overcome problems such as scaling and sensitivity towards the weights. Kudo...

Sequential genetic search for ensemble feature selection

by Mykola Pechenizkiy, Pádraig Cunningham, Alexey Tsymbal - In Proceedings of the International Joint Conference on Artificial Intelligence, 2005
"... Ensemble learning constitutes one of the main di-rections in machine learning and data mining. En-sembles allow us to achieve higher accuracy, which is often not achievable with single models. One technique, which proved to be effective for con-structing an ensemble of diverse classifiers, is the us ..."
Cited by 14 (2 self)
Ensemble learning constitutes one of the main directions in machine learning and data mining. Ensembles allow us to achieve higher accuracy, which is often not achievable with single models. One technique, which proved to be effective for constructing an ensemble of diverse classifiers, is the use of feature subsets. Among different approaches to ensemble feature selection, genetic search was shown to perform best in many domains. In this paper, a new strategy, GAS-SEFS (Genetic Algorithm-based Sequential Search for Ensemble Feature Selection), is introduced. Instead of one genetic process, it employs a series of processes, the goal of each of which is to build one base classifier. Experiments on 21 data sets are conducted, comparing the new strategy with a previously considered genetic strategy for different ensemble sizes and for five different ensemble integration methods. The experiments show that GAS-SEFS, although being more time-consuming, often builds better ensembles, especially on data sets with larger numbers of features.
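The key idea, one search process per base classifier rather than a single process for the whole ensemble, can be sketched as below. Random sampling stands in for the genetic operators, and the `fitness` callable, the subset size, and the diversity penalty are assumptions for illustration, not the GAS-SEFS algorithm itself.

```python
import random

def sequential_ensemble_feature_selection(fitness, n_features,
                                          n_classifiers=5, subset_size=10,
                                          candidates_per_step=200, seed=0):
    """Sketch of a GAS-SEFS-style sequential search.

    One search process is run per base classifier. Each step scores candidate
    feature subsets by `fitness(subset)` (a hypothetical accuracy estimate)
    minus a penalty for overlapping the subsets already chosen, nudging later
    classifiers towards diversity. Random sampling replaces the GA operators.
    """
    rng = random.Random(seed)
    chosen = []
    for _ in range(n_classifiers):
        best, best_score = None, float("-inf")
        for _ in range(candidates_per_step):
            subset = frozenset(rng.sample(range(n_features), subset_size))
            overlap = sum(len(subset & prev) for prev in chosen) / max(len(chosen), 1)
            score = fitness(subset) - 0.01 * overlap
            if score > best_score:
                best, best_score = subset, score
        chosen.append(best)
    return chosen
```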

An evaluation of overfit control strategies for multi-objective evolutionary optimization

by Paulo V. W. Radtke, Tony Wong, Robert Sabourin - WCCI/IJCNN, 2006
"... Abstract — The optimization of classification systems is often confronted by the solution over-fit problem. Solution over-fit occurs when the optimized classifier memorizes the training data sets instead of producing a general model. This paper compares two validation strategies used to control the ..."
Cited by 13 (5 self)
Abstract — The optimization of classification systems is often confronted by the solution over-fit problem. Solution over-fit occurs when the optimized classifier memorizes the training data sets instead of producing a general model. This paper compares two validation strategies used to control the over-fit phenomenon in classifier optimization problems. Both strategies are implemented within the multi-objective NSGA-II and MOMA algorithms to optimize a Projection Distance classifier and a Multiple Layer Perceptron neural network classifier, in both single and ensemble of classifier configurations. Results indicated that the use of a validation stage during the optimization process is superior to validation performed after the optimization process.
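The contrast in this abstract, validating candidate solutions throughout the search versus only once it has finished, can be illustrated with a minimal sketch of the first strategy around a generic iterative optimizer. `propose_candidates`, `fitness_on_train`, and `accuracy_on_validation` are hypothetical callables, not the NSGA-II/MOMA machinery used in the paper.

```python
def optimize_with_validation(propose_candidates, fitness_on_train,
                             accuracy_on_validation, n_generations=50):
    """Overfit-control sketch: keep the solution that scores best on an
    independent validation set at *any* point of the search, instead of
    validating only the final population.
    """
    best_solution, best_val_acc = None, float("-inf")
    population = propose_candidates(parents=None)
    for _ in range(n_generations):
        # Selection pressure still comes from the training/optimization set...
        population.sort(key=fitness_on_train, reverse=True)
        # ...but the validation set decides which solution is finally kept.
        for candidate in population:
            val_acc = accuracy_on_validation(candidate)
            if val_acc > best_val_acc:
                best_solution, best_val_acc = candidate, val_acc
        population = propose_candidates(parents=population[: len(population) // 2])
    return best_solution, best_val_acc
```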

Citation Context

...rategy is able to select P1 in Fig. 2 as the most accurate generalization solution. It produces better results than selecting solutions based solely on the accuracy on the optimization data set [6]–[8]. In spite of this, the use of a selection data set cannot prevent dominated solutions D1 and D2, which have great generalization power, from being discarded by the MOGA. The main shortcoming of thi...

Feature selection combining genetic algorithm and Adaboost classifiers

by H. Chouaib, O. Ramos Terrades, S. Tabbone, N. Vincent - In ICPR, 2008
"... This paper presents a fast method using simple genetic algorithms (GAs) for features selection. Unlike traditional approaches using GAs, we have used the combination of Adaboost classifiers to evaluate an individual of the population. So, the fitness function we have used is defined by the error rat ..."
Cited by 8 (0 self)
This paper presents a fast method using simple genetic algorithms (GAs) for feature selection. Unlike traditional approaches using GAs, we use a combination of Adaboost classifiers to evaluate an individual of the population, so the fitness function is defined by the error rate of this combination. This approach has been implemented and tested on the MNIST database, and the results confirm the effectiveness and robustness of the proposed approach.
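A minimal sketch of the fitness computation this abstract hints at follows: nothing is retrained per GA individual, and the fitness is simply the error rate of a vote of pre-trained classifiers restricted to the selected features. The array layout, the sign-based vote, and the binary labels are assumptions for illustration; this is not the authors' implementation.

```python
import numpy as np

def fitness_from_pretrained_classifiers(individual, per_feature_scores, y_val):
    """Error rate of the combined vote of pre-trained classifiers.

    `individual` is a boolean mask over features; `per_feature_scores[i]`
    holds the pre-computed real-valued validation-set outputs of a classifier
    associated with feature i; `y_val` holds labels in {-1, +1}. All of these
    names and conventions are illustrative assumptions.
    """
    selected = np.flatnonzero(individual)
    if selected.size == 0:
        return 1.0                                          # worst error for an empty selection
    combined = per_feature_scores[selected].sum(axis=0)     # sum the selected scores
    predictions = np.where(combined >= 0, 1, -1)            # combined {-1, +1} decision
    return float(np.mean(predictions != y_val))             # fitness = error rate
```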

Citation Context

...ion. Genetic Algorithms (GAs), used for feature selection and based on the wrapper method [4], need a classifier (SVM, neural network, nearest neighbour, ...) to evaluate each individual of the population [1, 6, 11]. But training classifiers at each iteration of the GA is too time-consuming. In this paper, we propose a new feature selection method, based on GA, which avoids classifier training at each iteration ...

Feature Selection Based on Genetic Algorithms for Speaker Recognition

by Maider Zamalloa, Germán Bordel, Luis Javier Rodríguez, Mikel Peñagarikano - IEEE Speaker and Language Recognition Workshop, 2006
"... The Mel-Frequency Cepstral Coefficients (MFCC) and their derivatives are commonly used as acoustic features for speaker recognition. The issue arises of whether some of those features are redundant or dependent on other features. Probably, not all of them are equally relevant for speaker recognition ..."
Cited by 7 (1 self)
The Mel-Frequency Cepstral Coefficients (MFCC) and their derivatives are commonly used as acoustic features for speaker recognition. The issue arises of whether some of those features are redundant or dependent on other features; probably not all of them are equally relevant for speaker recognition. Reduced feature sets allow more robust estimates of the model parameters. Also, fewer computational resources are required, which is crucial for real-time speaker recognition applications using low-resource devices. In this paper, we use feature weighting as an intermediate step towards feature selection. Genetic algorithms are used to find the optimal set of weights for a 38-dimensional feature set, consisting of 12 MFCC, their first and second derivatives, energy and its first derivative. To evaluate each set of weights, speaker recognition errors are counted over a validation dataset. Speaker models are based on empirical distributions of acoustic labels, obtained through vector quantization. On average, weighting acoustic features yields between 15% and 25% error reduction in speaker recognition tests. Finally, features are sorted according to their weights, and the K features with greatest average ranks are retained and evaluated. We conclude that combining feature weighting and feature selection allows costs to be reduced without degrading performance.
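Two of the steps described here lend themselves to a short sketch: applying per-feature weights inside a distance computation (here a weighted Euclidean distance against a VQ codeword, which is an assumption about the exact form used), and the final selection of the K features with the greatest average weight.

```python
import numpy as np

def weighted_distance(frame, codeword, weights):
    """Weighted Euclidean distance between an acoustic frame and a VQ
    codeword; the GA-derived weights scale each feature's contribution.
    (Illustrative form, not necessarily the authors' exact distance.)"""
    return np.sqrt(np.sum(weights * (frame - codeword) ** 2))

def select_top_k_features(avg_weights, k):
    """Final selection step from the abstract: rank features by their
    average GA-derived weight and retain the K best."""
    return np.argsort(avg_weights)[::-1][:k]

# Illustrative use with made-up numbers (38 weights, keep the best 20 features):
rng = np.random.default_rng(0)
avg_weights = rng.random(38)          # stand-in for weights averaged over GA runs
kept = select_top_k_features(avg_weights, k=20)
```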

Citation Context

...sumption about the properties of the evaluation function. Multiobjective evaluation functions (e.g. combining the accuracy and the cost of classification) can be defined and used in a natural way [6] [7]. GAs can easily encode decisions about selecting or not selecting features as sequences of boolean values, and allow the feature space to be explored smartly by retaining those decisions that benefit the cla...

Recent Advances in Handwriting Recognition

by Flávio Bortolozzi, Alceu de Souza Britto Jr., Luiz S. Oliveira, Marisa Morita - Document Analysis (editors: Umapada Pal, Swapan K. Parui, Bidyut B. Chaudhuri), pp. 1-30
"... Machine simulation of human reading has been subject of intensive research for the last three decades. This paper presents a summary about the recent advances in terms of character, word, numeral string, and setence recognition. In addition, the main new trends in the field of handwriting recognitio ..."
Cited by 3 (0 self)
Machine simulation of human reading has been a subject of intensive research for the last three decades. This paper presents a summary of recent advances in character, word, numeral string, and sentence recognition. In addition, the main new trends in the field of handwriting recognition are discussed and some important contributions are presented.

Asynchronous master-slave parallelization of differential evolution for multi-objective optimization

by Matjaž Depolli, Roman Trobec, Bogdan Filipič - Evolutionary Computation
"... Abstract In this paper, we present AMS-DEMO, an asynchronous master-slave implementation of DEMO, an evolutionary algorithm for multiobjective optimization. AMS-DEMO was designed for solving time-demanding problems efficiently on both homogeneous and heterogeneous parallel computer architectures. T ..."
Cited by 3 (0 self)
Abstract: In this paper, we present AMS-DEMO, an asynchronous master-slave implementation of DEMO, an evolutionary algorithm for multiobjective optimization. AMS-DEMO was designed for solving time-demanding problems efficiently on both homogeneous and heterogeneous parallel computer architectures. The algorithm is used as a test case for asynchronous master-slave parallelization of multiobjective optimization, which has not yet been thoroughly investigated. Selection lag is identified as the key property of the parallelization method; it explains how the method's behavior depends on the type of computer architecture and the number of processors, and it is derived both analytically and from the empirical results. AMS-DEMO is tested on a benchmark problem and a time-demanding industrial optimization problem, on homogeneous and heterogeneous parallel setups, providing performance results for the algorithm and an insight into the parallelization method. A comparison is also performed between AMS-DEMO and generational master-slave DEMO to demonstrate how the asynchronous parallelization method enhances the algorithm and what benefits it brings compared to the synchronous method.
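The difference between generational and asynchronous master-slave evaluation can be sketched with a generic worker pool: the master dispatches a new candidate as soon as any evaluation returns, rather than waiting for a whole generation to finish. `evaluate`, `make_candidate`, and `integrate` are hypothetical callables; this illustrates the parallelization pattern only and is not the AMS-DEMO code.

```python
from concurrent.futures import ProcessPoolExecutor, FIRST_COMPLETED, wait

def asynchronous_master_slave(evaluate, make_candidate, integrate,
                              n_workers=4, n_evaluations=1000):
    """Asynchronous master-slave sketch: the master never waits for a full
    generation. As soon as any worker finishes, its result is integrated into
    the population and a new candidate is dispatched, so fast and slow
    evaluations overlap instead of idling the workers.
    """
    done_count = 0
    with ProcessPoolExecutor(max_workers=n_workers) as pool:
        pending = {pool.submit(evaluate, make_candidate()) for _ in range(n_workers)}
        while pending:
            finished, pending = wait(pending, return_when=FIRST_COMPLETED)
            for future in finished:
                integrate(future.result())        # update the population immediately
                done_count += 1
                if done_count + len(pending) < n_evaluations:
                    pending.add(pool.submit(evaluate, make_candidate()))
    return done_count
```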

Citation Context

... the master-slave type without prior modification. The highest efficiency of the master-slave parallelization type can be achieved on computers with homogeneous processors and in problem domains where the fitness evaluation time is long, constant, and independent of the solution. When these criteria are fulfilled, near-linear speedup (Akl, 1997), i.e. speedup that is close to the upper theoretical limit, is possible. The master-slave parallelization is popular with MOEAs, ranging from simple implementations as in the case of Oliveira et al. (2003), where the master runs on a separate processor, and the cases of Radtke et al. (2003) and Nebro and Durillo (2010), where the master node also runs one slave. There are also implementations for heterogeneous computer architectures where load balancing has to be implemented. Examples can be found in Eberhard et al. (2003) with a pool-of-tasks load-balancing algorithm, Lim et al. (2007), where a grid-enabled algorithm combines the island model with the master-slave model, Stanley and Mudge (1995), and Talbi and Meunier (2006) with an asynchronous master-slave parallelization of a steady-state algori...

Feature Selection for Ensembles Using the Multi-Objective Optimization Approach

by Luiz S. Oliveira, Marisa Morita, Robert Sabourin, 2006
"... Feature selection for ensembles has shown to be an effective strategy for ensemble creation due to its ability of producing good subsets of features, which make the classifiers of the ensemble disagree on difficult cases. In this paper we present an ensemble feature selection approach based on a hi ..."
Cited by 2 (0 self)
Feature selection for ensembles has been shown to be an effective strategy for ensemble creation due to its ability to produce good subsets of features, which make the classifiers of the ensemble disagree on difficult cases. In this paper we present an ensemble feature selection approach based on a hierarchical multi-objective genetic algorithm. The underpinning paradigm is "overproduce and choose". The algorithm operates at two levels. First, it performs feature selection in order to generate a set of classifiers; it then chooses the best team of classifiers. In order to show its robustness, the method is evaluated in two different contexts: supervised and unsupervised feature selection. In the former, we consider the problem of handwritten digit recognition and use three different feature sets and multi-layer perceptron neural networks as classifiers. In the latter, we consider the problem of handwritten month word recognition and use three different feature sets and hidden Markov models as classifiers. Experiments and comparisons with classical methods, such as Bagging and Boosting, demonstrated that the proposed methodology brings compelling improvements when classifiers have to work with

Reliable Recognition of Handwritten Digits Using A Cascade Ensemble Classifier System and Hybrid Features

by Ping Zhang, 2006
"... 1.1 OCR: the Motivation Optical Character Recognition (OCR) is a branch of pattern recognition, and also a branch of computer vision. OCR has been extensively researched for more than four decades. With the advent of digital computers, many researchers and engineers have been ..."
Cited by 2 (0 self)
1.1 OCR: the Motivation Optical Character Recognition (OCR) is a branch of pattern recognition, and also a branch of computer vision. OCR has been extensively researched for more than four decades. With the advent of digital computers, many researchers and engineers have been
(Show Context)

Citation Context

...attractive approach to feature selection since they can generally perform quite an effective search of a large, non-linear space [104]. In the handwritten character recognition area, some researchers [51, 86] have developed OCR-oriented criteria or fitness functions, which can alleviate the computation complexity for a given feature number m (m ≤ n, where n is the number of features initially extracted). However...
