Abstract:
We investigate feature extraction and selection for automatically classifying forestry images according to their "beauty". We experimentally evaluate three recent advances in feature selection: a wrapper method proposed by A. Ng, which is more robust when irrelevant features greatly outnumber relevant ones, and two filter methods, one based on boosting and one on support vector machines (SVM). As a baseline, we use 65 features chosen according to human expertise, which yield an error rate of 18.9% with an SVM classifier. A data-driven method proposed by K. Tieu and P. Viola is used to obtain 46,875 alternative features, on which the feature selection methods are tested. The SVM-based filter method achieved a misclassification rate of 18.9% using 10,000 features, and a wrapper method with an SVM classifier reached 21.4% using only 4 features. We also discuss why, in practice, it is more convenient to use a filter method to preselect features for wrappers than to use Ng's wrapper method with all features.
Citations
674 | The Elements of Statistical Learning – Hastie, Tibshirani, et al. - 2001
540 | Wrappers for Feature Subset Selection – Kohavi, John - 1997
389 | Improved boosting algorithms using confidence-rated predictions – Schapire, Singer - 1998
347 | Statistical pattern recognition: A review – Jain, Duin, et al. - 2000
247 | Selection of relevant features and examples in machine learning – Blum, Langley - 1997
178 | Feature selection: Evaluation, application, and small sample performance – Jain, Zongker - 1997
147 | Boosting image retrieval – Tieu, Viola - 2000
48 | Correlation-based feature selection for machine learning – Hall - 1998
27 | On Feature Selection: Learning with Exponentially Many Irrelevant Features as Training Examples – Ng - 1998
21 | Image representations for object detection using kernel classifiers – Evgeniou, Pontil, et al. - 2000
18 | Filters, Wrappers and a Boosting-Based Hybrid for Feature Selection – Das - 2001
5 | Contribution of Boosting in Wrapper Models – Sebban, Nock - 1999
1 | Scenic beauty estimate of forestry images – Kalidindi, et al. - 1997
1 | Statistical Pattern Recognition – Webb - 1999
1 | Feature selection for high-dimensional genomic microarray data – Xing, Jordan, et al. - 2001