Learning Classifiers from Imbalanced Data Based on Biased Minimax Probability Machine
, 2004
Cited by 23 (1 self)
We consider the problem of binary classification on imbalanced data, in which nearly all the instances are labelled as one class, while far fewer instances are labelled as the other class, usually the more important class. Traditional machine learning methods that seek accurate performance over the full range of instances are not suitable for this problem, since they tend to classify all the data into the majority class, which is usually the less important one. Moreover, some current methods have tried to use intermediate factors, e.g., the distribution of the training set, the decision thresholds, or the cost matrices, to influence the bias of the classification. However, it remains uncertain whether these methods can improve the performance in a systematic way. In this paper, we propose a novel model named Biased Minimax Probability Machine. Different from previous methods, this model directly controls the worst-case real accuracy of classification of the future data to build up biased classifiers. Hence, it provides a rigorous treatment of imbalanced data. The experimental results of the novel model, compared with those of three competitive methods, i.e., the Naive Bayesian classifier, the k-Nearest Neighbor method, and the decision tree method C4.5, demonstrate the superiority of our novel model.
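The worst-case accuracy that this abstract refers to can be illustrated numerically. The sketch below assumes the standard minimax probability machine bound: for a class known only through its mean and covariance, the infimum of Pr{a^T x >= b} over all distributions with those moments is d^2 / (1 + d^2), where d^2 = max(a^T mean - b, 0)^2 / (a^T cov a). The function name, argument names, and toy numbers are illustrative assumptions and are not taken from the paper.

```python
def worst_case_accuracy(mean, cov, a, b):
    """Lower bound on Pr{a^T x >= b} over all distributions with this mean and covariance.

    Assumed bound (not quoted from the abstract): d^2 / (1 + d^2), with
    d^2 = max(a^T mean - b, 0)^2 / (a^T cov a).
    """
    n = len(a)
    # Signed margin of the class mean with respect to the hyperplane a^T x = b.
    margin = sum(a[i] * mean[i] for i in range(n)) - b
    if margin <= 0:
        # The mean lies on the wrong side (or on the hyperplane): no guarantee.
        return 0.0
    # Variance of the projection a^T x.
    variance = sum(a[i] * cov[i][j] * a[j] for i in range(n) for j in range(n))
    if variance == 0:
        # Degenerate case: the projection is constant and on the correct side.
        return 1.0
    d2 = margin * margin / variance
    return d2 / (1.0 + d2)


if __name__ == "__main__":
    identity = [[1.0, 0.0], [0.0, 1.0]]
    a, b = [1.0, 0.0], 0.0  # hypothetical decision hyperplane x_1 = 0

    # Hypothetical "important" class x, required to satisfy a^T x >= b.
    alpha = worst_case_accuracy([2.0, 0.0], identity, a, b)

    # Hypothetical other class y, required to satisfy a^T y <= b,
    # which is the same as (-a)^T y >= -b.
    beta = worst_case_accuracy([-1.0, 0.0], identity, [-v for v in a], -b)

    print(alpha, beta)  # 0.8 0.5
```

A biased classifier in this spirit would choose (a, b) to make the first value as large as possible while keeping the second above a preset floor; the optimization itself is not shown here.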
A Hierarchical and Contextual Model for Aerial Image Understanding
Cited by 23 (6 self)
In this paper we present a novel method for parsing aerial images with a hierarchical and contextual model learned in a statistical framework. We learn hierarchies at the scene and object levels to handle the difficult task of representing scene elements at different scales and add contextual constraints to resolve ambiguities in the scene interpretation. This allows the model to rule out inconsistent detections, like cars on trees, and to verify low probability detections based on their local context, such as small cars in parking lots. We also present a two-step algorithm for parsing aerial images that first detects object-level elements like trees and parking lots using color histograms and bag-of-words models, and objects like roofs and roads using compositional boosting, a powerful method for finding image structures. We then activate the top-down scene model to prune false positives from the first stage. We learn this scene model in a minimax entropy framework and show unique samples from our prior model, which capture the layout of scene objects. We present experiments showing that hierarchical and contextual information greatly reduces the number of false positives in our results.
Visual learning by evolutionary feature synthesis
- in Int. Conf. on Machine Learning
, 2003
Cited by 12 (1 self)
In this paper, we present a novel method for learning complex concepts/hypotheses directly from raw training data. The task addressed here concerns data-driven synthesis of recognition procedures for real-world object recognition tasks. The method uses linear genetic programming to encode potential solutions expressed in terms of elementary operations, and handles the complexity of the learning task by applying cooperative coevolution to decompose the problem automatically. The training consists of coevolving feature extraction procedures, each being a sequence of elementary image processing and feature extraction operations. Extensive experimental results show that the approach attains competitive performance for 3-D object recognition in real synthetic aperture radar (SAR) imagery.
On Machine Learning, ROC Analysis, and Statistical Tests of Significance
- 16th International Conference on Pattern Recognition, IEEE
, 2002
Cited by 10 (2 self)
ROC analysis is being used with greater frequency as an evaluation methodology in machine learning and pattern recognition. Researchers have used ANOVA to determine if the results from such analysis are statistically significant. Yet, in the medical decision making community, the prevailing method is LABMRMC. Although this latter method uses ANOVA, before doing so, it applies the Jackknife method to account for case-sample variance. To determine whether these two tests make the same decisions regarding statistical significance, we conducted a Monte Carlo simulation using several problems derived from Gaussian distributions, three machine-learning algorithms, ROC analysis, ANOVA, and LABMRMC. Results suggest that the decisions these tests make are not the same, even for simple problems. Furthermore, the larger issue is that since ANOVA does not account for case-sample variance, one cannot generalize experimental results to the population from which the data were drawn.
Learning and recognition of hand-drawn shapes using generative genetic programming
- EvoWorkshops 2007, volume 4448 of LNCS
, 2007
Cited by 8 (5 self)
We describe a novel method of evolutionary visual learning that uses a generative approach for assessing a learner’s ability to recognize image contents. Each learner, implemented as a genetic programming individual, processes visual primitives that represent local salient features derived from a raw input raster image. In response to that input, the learner produces a partial reproduction of the input image, and is evaluated according to the quality of that reproduction. We present the method in detail and verify it experimentally on the real-world task of recognition of hand-drawn shapes.
Visual Learning by Evolutionary and Coevolutionary Feature Synthesis
- IEEE Transactions on Evolutionary Computation
, 2007
Cited by 7 (1 self)
In this paper, we present a novel method for learning complex concepts/hypotheses directly from raw training data. The task addressed here concerns data-driven synthesis of recognition procedures for real-world object recognition. The method uses linear genetic programming to encode potential solutions expressed in terms of elementary operations, and handles the complexity of the learning task by applying cooperative coevolution to decompose the problem automatically at the genotype level. The training coevolves feature extraction procedures, each being a sequence of elementary image processing and computer vision operations applied to input images. Extensive experimental results show that the approach attains competitive performance for three-dimensional object recognition in real synthetic aperture radar imagery. Index Terms: Computer vision (CV), cooperative coevolution (CC), evolutionary computation (EC), machine learning (ML), pattern recognition, visual learning.
Knowledge reuse in genetic programming applied to visual learning
- in D. Thierens (Ed.), Genetic and Evolutionary Computation Conference GECCO, Association for Computing Machinery
, 2007
Cited by 5 (5 self)
We propose a method of knowledge reuse for an ensemble of genetic programming-based learners solving a visual learning task. First, we introduce a visual learning method that uses genetic programming individuals to represent hypotheses. Individuals-hypotheses process an image representation composed of visual primitives derived from the training images that contain objects to be recognized. The process of recognition is generative, i.e., an individual is supposed to restore the shape of the processed object by drawing its reproduction on a separate canvas. This canonical method is then extended with a knowledge reuse mechanism that allows a learner to import genetic material from hypotheses that evolved for the other decision classes (object classes). We compare the performance of the extended approach to the basic method on a real-world task of handwritten character recognition, and conclude that knowledge reuse leads to significant convergence speedup and, more importantly, significantly reduces the risk of overfitting.
Generative Learning of Visual Concepts using Multiobjective Genetic Programming
Cited by 5 (2 self)
This paper introduces a novel method of visual learning based on Genetic Programming, which evolves a population of individuals (image analysis programs) that process attributed visual primitives derived from raw raster images. The goal is to evolve an image analysis program that correctly recognizes the training concept (shape). The approach uses a generative evaluation scheme: individuals are rewarded for reproducing the shape of the object being recognized using graphical primitives and elementary background knowledge encoded in predefined operators. The evolutionary run is driven by a multiobjective fitness function to prevent premature convergence and enable effective exploration of the space of solutions. We present the method in detail and verify it experimentally on the task of learning two visual concepts from examples. Key words: Visual learning, genetic programming, generative pattern recognition, evolutionary synthesis.
Genetic programming for cross-task knowledge sharing
- In Genetic and Evolutionary Computation Conference GECCO
, 2007
Cited by 4 (4 self)
We consider multitask learning of visual concepts within a genetic programming (GP) framework. The proposed method evolves a population of GP individuals, each composed of several GP trees that process visual primitives derived from input images. The two main trees are delegated to solving two different visual tasks and are allowed to share knowledge with each other by calling the remaining GP trees (subfunctions) included in the same individual. The method is applied to the visual learning task of recognizing simple shapes, using the generative approach based on visual primitives introduced in [17]. We compare this approach to a reference method devoid of knowledge sharing, and conclude that in the worst case cross-task learning performs equally well, and in many cases it leads to significant performance improvements in one or both solved tasks.
Maximizing Sensitivity in Medical Diagnosis Using Biased Minimax Probability Machine
Cited by 3 (0 self)
The challenging task of medical diagnosis based on machine learning techniques requires an inherent bias, i.e., the diagnosis should favor the “ill” class over the “healthy” class, since misdiagnosing a patient as a healthy person may delay the therapy and aggravate the illness. Therefore, the objective in this task is not to improve the overall accuracy of the classification, but to focus on improving the sensitivity (the accuracy of the “ill” class) while maintaining an acceptable specificity (the accuracy of the “healthy” class). Some current methods adopt roundabout ways to impose a certain bias toward the important class, i.e., they try to utilize some intermediate factors to influence the classification. However, it remains uncertain whether these methods can improve the classification performance systematically. In this paper, by engaging a novel learning tool, the biased minimax probability machine (BMPM), we deal with the issue in a more elegant way and directly achieve the objective of appropriate medical diagnosis. More specifically, the BMPM directly controls the worst case accuracies to incorporate a bias toward the “ill” class. Moreover, in a distribution-free way, the BMPM derives the decision rule in such a way as to maximize the worst case sensitivity while maintaining an acceptable worst case specificity. By directly controlling the accuracies, the BMPM provides a more rigorous way to handle medical diagnosis; by deriving a distribution-free decision rule, the BMPM distinguishes itself from a large family of classifiers, namely, the generative classifiers, where an assumption on the data distribution is necessary. We evaluate the performance of the model and compare it with three traditional classifiers: the k-nearest neighbor, the naive Bayesian, and the C4.5. The test results on two medical datasets, the breast-cancer dataset and the heart disease dataset, show that the BMPM outperforms the other three models.
Index Terms: Biased classification, medical diagnosis, minimax probability machine, worst case accuracy.
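The two quantities this abstract trades off, sensitivity and specificity, can be computed from predicted and true labels as in the minimal sketch below. The function name, the label encoding (1 for "ill", 0 for "healthy"), and the toy labels are assumptions made for illustration; none of the data comes from the paper.

```python
def sensitivity_specificity(y_true, y_pred, positive=1):
    """Return (sensitivity, specificity) for binary labels.

    Sensitivity is the accuracy on the positive ("ill") class;
    specificity is the accuracy on the negative ("healthy") class.
    """
    tp = fn = tn = fp = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == positive:
            if pred == positive:
                tp += 1  # ill, diagnosed ill
            else:
                fn += 1  # ill, missed
        else:
            if pred == positive:
                fp += 1  # healthy, flagged ill
            else:
                tn += 1  # healthy, diagnosed healthy
    # Guard against a class with no examples.
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return sensitivity, specificity


if __name__ == "__main__":
    # Hypothetical labels: 3 ill patients, 5 healthy ones.
    y_true = [1, 1, 1, 0, 0, 0, 0, 0]
    y_pred = [1, 1, 0, 0, 0, 0, 0, 1]
    sens, spec = sensitivity_specificity(y_true, y_pred)
    print(round(sens, 3), round(spec, 3))  # 0.667 0.8
```

Overall accuracy on these toy labels is 6/8 = 0.75, which hides the fact that one in three ill patients is missed; reporting the two rates separately makes that visible.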