by Patrice Latinne, Olivier Debeir, Christine Decaestecker
http://www.ulb.ac.be/polytech/sln/team/odebeir/latinne_paa2002.pdf
Abstract: Several ways of manipulating a training set have shown that combining weakened classifiers can improve prediction accuracy. In the present paper, we focus on learning set sampling (Breiman’s Bagging) and random feature subset selection (Ho’s Random Subspaces). We present a combination scheme labelled ‘Bagfs’, in which new learning sets are generated on the basis of both bootstrap replicates and random subspaces. The performance of the three methods (Bagging, Random Subspaces and Bagfs) is compared to that of the standard Adaboost algorithm. All four methods are assessed by means of a decision-tree inducer (C4.5). In addition, we study whether the number of weak classifiers and the way in which they are created have a significant influence on the performance of their combination. To answer these two questions, we applied the McNemar test of significance and the Kappa degree-of-agreement. The results, obtained on 23 conventional databases, show that on average, Bagfs exhibits the best agreement between prediction and supervision.
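The abstract describes Bagfs as generating each new learning set from both a bootstrap replicate and a random feature subspace. The sketch below illustrates that idea only; it is not the authors' implementation. The function names (`bagfs_learning_sets`, `project`, `majority_vote`), the fixed `subspace_size` parameter, and the use of simple majority voting to combine predictions are all assumptions made for illustration, since the abstract does not specify them.

```python
import random
from collections import Counter


def bagfs_learning_sets(n_samples, n_features, n_sets, subspace_size, seed=0):
    """Yield (rows, features) index pairs, one per weak classifier.

    Hypothetical helper: combines a bootstrap replicate of the rows
    (Bagging) with a random subset of the features (Random Subspaces).
    """
    rng = random.Random(seed)
    for _ in range(n_sets):
        # Bootstrap replicate: n_samples row indices drawn with replacement.
        rows = [rng.randrange(n_samples) for _ in range(n_samples)]
        # Random subspace: feature indices drawn without replacement.
        features = sorted(rng.sample(range(n_features), subspace_size))
        yield rows, features


def project(X, rows, features):
    """Build one learning set from X (a list of feature lists)."""
    return [[X[r][f] for f in features] for r in rows]


def majority_vote(predictions):
    """Combine the weak classifiers' labels for one example.

    Assumed combination rule; ties go to the label seen first.
    """
    return Counter(predictions).most_common(1)[0][0]


if __name__ == "__main__":
    X = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]]
    for rows, features in bagfs_learning_sets(len(X), 4, n_sets=3, subspace_size=2):
        print(features, project(X, rows, features))
```

Each projected learning set would then be passed to a base inducer (C4.5 in the paper), and at prediction time each tree must see only the features it was trained on.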
Citations
1453 | Bagging predictors – Breiman - 1996
1133 | A decision-theoretic generalization of online learning and an application to boosting – Freund, Schapire - 1995
435 | Nonparametric Statistics for the Behavioral Sciences (2nd Ed.) – Siegel, Castellan - 1988
232 | Methods of combining multiple classifiers and their applications to handwriting recognition – Xu, Krzyzak, et al. - 1992
161 | The random subspace method for constructing decision forests – Ho - 1998
89 | On Comparing Classifiers: Pitfalls to Avoid and a Recommended Approach. Data Mining and Knowledge Discovery 1:3 – Salzberg - 1997
64 | Application of majority voting to pattern recognition: an analysis of its behavior and performance – Lam, Suen - 1997
42 | Applying the weak learning framework to understand and improve C4.5 – Dietterich, Kearns, et al. - 1996
41 | C4.5: programs for machine learning – JR - 1998
33 | Combining classifiers: A theoretical framework – Kittler - 1998
33 | Option decision trees with majority votes – Kohavi, Kunz - 1997
29 | Combinations of weak classifiers – Ji, Ma - 1997
22 | Classifier combinations: implementations and theoretical issues – Lam - 2000
18 | Input decimated ensembles: Decorrelation through dimensionality reduction – Oza, Tumer - 2001
16 | Dynamic classifier selection based on multiple classifier behavior – Giacinto
15 | Random forests - random features – Breiman - 1999
14 | Some Theoretical Aspects of Boosting in the Presence of Noisy Data – Jiang - 2001
12 | Classifier combining: analytical results and implications – Tumer, Ghosh - 1996
12 | Fundamentals of Biostatistics – Rosner - 1995
10 | Colla AM. Democracy in neural nets: voting schemes for classification. Neural Networks – Battiti - 1994
10 | Putting it all together: Methods for combining neural networks – Perrone - 1994
5 | Srihari SN. Decision combination in multiple classifier systems – TK, JJ - 1994
5 | Approximate statistical tests for comparing supervised classification learning algorithms. Neural Computation – TG - 1998
4 | The strength of weak learnability – RE - 1990
4 | Tax DMJ. Experiments with classifier combining rules – RPW - 2000
3 | An experimental comparison of three methods for constructing ensembles of decision trees: bagging, boosting, and randomization – TG - 2000
3 | Generating classifier committees by stochastically selecting both attributes and training examples – Zheng - 1998
2 | Pazzani MJ. Error reduction through learning multiple descriptions – KM - 1996
1 | Data complexity analysis for classifier combination – TK - 2001
1 | A method of combining multiple experts for the recognition of unconstrained handwritten numerals – YS, CY - 1995
1 | Nearest neighbor classification from multiple feature subsets – SD - 1998
1 | Merz CJ. UCI repository of machine learning databases. http://www.ics.uci.edu/mlearn/MLRepository.html – Blake, Keogh - 1998
1 | Arching classifiers. Annals of statistics 1998; 26:801–849 – Breiman
1 | Fitzpatrick-Lins K. A coefficient of agreement as a measure of thematic classification accuracy. Photogrammetric Engineering and Remote Sensing – GH - 1986