Abstract:
Lazy learning methods such as the k-nearest neighbor classifier require storing the whole training set and may be too costly when this set is large. The condensed nearest neighbor classifier incrementally stores a subset of the sample, thus decreasing storage and computation requirements. We propose to train multiple such subsets and take a vote over them, thus combining predictions from a set of concept descriptions. We investigate two voting schemes: simple voting, where voters have equal weight, and weighted voting, where weights depend on the classifiers' confidences in their predictions. We consider ways to form such subsets for improved performance. When the training set is small, voting improves performance considerably. If the training set is not small, the voters converge to similar solutions and we gain nothing by voting. To alleviate this, when the training set is of intermediate size, we use bootstrapping to generate smaller training sets over which we train the voters. When the training set is large, we partition it into smaller, mutually exclusive subsets and then train the voters. We report simulation results on six datasets, with good results. We also review methods for combining multiple learners. The idea of taking a vote over multiple learners can be applied with any type of learning scheme.
Key words: lazy learning, nonparametric estimation, k-nearest neighbor, condensed nearest neighbor, voting
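The scheme described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes a Hart-style condensing pass with 1-nearest-neighbor lookup and squared Euclidean distance, and it assumes that voters trained on the same data differ only in the random order in which samples are presented. The function names, the default bootstrap size (half the training set), and the `mode` parameter are all hypothetical. Only simple (equal-weight) voting is shown.

```python
import random
from collections import Counter

def sq_dist(a, b):
    # Squared Euclidean distance between two feature tuples.
    return sum((x - y) ** 2 for x, y in zip(a, b))

def nearest_label(store, x):
    # 1-nearest-neighbor prediction from a stored subset of (features, label) pairs.
    return min(store, key=lambda s: sq_dist(s[0], x))[1]

def condense(samples, rng):
    # Hart-style condensing (assumed variant): sweep the shuffled data and
    # store every sample the current subset misclassifies, repeating until
    # a full sweep adds nothing. The membership check guarantees termination
    # even if identical points carry conflicting labels.
    order = list(samples)
    rng.shuffle(order)
    store = [order[0]]
    changed = True
    while changed:
        changed = False
        for x, y in order:
            if (x, y) not in store and nearest_label(store, x) != y:
                store.append((x, y))
                changed = True
    return store

def make_training_sets(samples, n_voters, mode, rng, size=None):
    # "bootstrap": draw smaller sets with replacement (intermediate-size data).
    # "partition": split into mutually exclusive subsets (large data).
    # anything else: every voter sees the full set (small data).
    if mode == "bootstrap":
        m = size or max(1, len(samples) // 2)  # assumed default size
        return [rng.choices(samples, k=m) for _ in range(n_voters)]
    if mode == "partition":
        shuffled = list(samples)
        rng.shuffle(shuffled)
        return [shuffled[i::n_voters] for i in range(n_voters)]
    return [samples] * n_voters

def train_voters(samples, n_voters, mode="all", seed=0):
    rng = random.Random(seed)
    sets = make_training_sets(samples, n_voters, mode, rng)
    return [condense(s, rng) for s in sets]

def simple_vote(voters, x):
    # Simple voting: each condensed subset casts one equally weighted vote.
    votes = Counter(nearest_label(store, x) for store in voters)
    return votes.most_common(1)[0][0]

# Toy data: two well-separated classes in the plane.
samples = [((0.0, 0.0), "a"), ((0.1, 0.2), "a"), ((0.2, 0.1), "a"),
           ((5.0, 5.0), "b"), ((5.1, 4.9), "b"), ((4.9, 5.2), "b")]
voters = train_voters(samples, n_voters=3, seed=1)
prediction = simple_vote(voters, (0.05, 0.05))
```

On this toy data each condensed subset is smaller than the training set yet still classifies every training sample correctly. Weighted voting would replace the `Counter` tally with a per-voter confidence weight, which this sketch leaves out because the abstract does not say how confidences are computed.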