Instance-based learning algorithms
- Machine Learning, 1991
Cited by 897 (18 self)
Abstract. Storing and using specific instances improves the performance of several supervised learning algorithms. These include algorithms that learn decision trees, classification rules, and distributed networks. However, no investigation has analyzed algorithms that use only specific instances to solve incremental learning tasks. In this paper, we describe a framework and methodology, called instance-based learning, that generates classification predictions using only specific instances. Instance-based learning algorithms do not maintain a set of abstractions derived from specific instances. This approach extends the nearest neighbor algorithm, which has large storage requirements. We describe how storage requirements can be significantly reduced with, at most, minor sacrifices in learning rate and classification accuracy. While the storage-reducing algorithm performs well on several real-world databases, its performance degrades rapidly with the level of attribute noise in training instances. Therefore, we extended it with a significance test to distinguish noisy instances. This extended algorithm's performance degrades gracefully with increasing noise levels and compares favorably with a noise-tolerant decision tree algorithm.
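The abstract above does not spell out how storage is reduced. One common reading, sketched here as an assumption and not as the authors' exact procedure, is a nearest neighbor classifier that stores a training instance only when the instances stored so far misclassify it. The function names `nearest_label` and `train_storage_reduced` and the toy data are hypothetical.

```python
import math

def nearest_label(stored, x):
    """Return the label of the stored instance closest to x (Euclidean distance)."""
    closest = min(stored, key=lambda item: math.dist(item[0], x))
    return closest[1]

def train_storage_reduced(stream):
    """Process instances one at a time (incrementally).

    Assumption: an instance is kept only if the instances stored so far
    would misclassify it; correctly classified instances are discarded.
    """
    stored = []
    for x, y in stream:
        if not stored or nearest_label(stored, x) != y:
            stored.append((x, y))
    return stored

# Toy data: two well-separated clusters labeled "a" and "b".
data = [((0.0, 0.0), "a"), ((0.1, 0.0), "a"), ((1.0, 1.0), "b"),
        ((0.9, 1.0), "b"), ((0.0, 0.1), "a"), ((1.0, 0.9), "b")]
stored = train_storage_reduced(data)
print(len(stored), "of", len(data), "instances stored")
```

On clean, well-separated data like this, only one instance per cluster is retained. A mislabeled instance would always be stored under this rule, which is consistent with the abstract's remark that performance degrades with noise.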
The CN2 Induction Algorithm
- Machine Learning, 1989
Cited by 682 (6 self)
Systems for inducing concept descriptions from examples are valuable tools for assisting in the task of knowledge acquisition for expert systems. This paper presents a description and empirical evaluation of a new induction system, CN2, designed for the efficient induction of simple, comprehensible production rules in domains where problems of poor description language and/or noise may be present. Implementations of the CN2, ID3 and AQ algorithms are compared on three medical classification tasks.
Overfitting Avoidance as Bias
- 1992
Cited by 116 (2 self)
Strategies for increasing predictive accuracy through selective pruning have been widely adopted by researchers in decision tree induction. It is easy to get the impression from research reports that there are statistical reasons for believing that these overfitting avoidance strategies do increase accuracy and that, as a research community, we are making progress toward developing powerful, general methods for guarding against overfitting in inducing decision trees. In fact, any overfitting avoidance strategy amounts to a form of bias and, as such, may degrade performance instead of improving it. If pruning methods have often proven successful in empirical tests, this is due, not to the methods, but to the choice of test problems. As examples in this article illustrate, overfitting avoidance strategies are not better or worse, but only more or less appropriate to specific application domains. We are not, and cannot be, making progress toward methods both powerful and general. The ...
Further Experimental Evidence against the Utility of Occam's Razor
- Journal of Artificial Intelligence Research, 1996
Cited by 51 (5 self)
This paper presents new experimental evidence against the utility of Occam's razor. A systematic procedure is presented for post-processing decision trees produced by C4.5. This procedure was derived by rejecting Occam's razor and instead attending to the assumption that similar objects are likely to belong to the same class. It increases a decision tree's complexity without altering the performance of that tree on the training data from which it is inferred. The resulting more complex decision trees are demonstrated to have, on average, for a variety of common learning tasks, higher predictive accuracy than the less complex original decision trees. This result raises considerable doubt about the utility of Occam's razor as it is commonly applied in modern machine learning.
1. Introduction
In the fourteenth century William of Occam stated "plurality should not be assumed without necessity". This principle has since become known as Occam's razor. Occam's razor was originally intended a...
Pruning Algorithms for Rule Learning
- 1997
Cited by 40 (14 self)
Pre-pruning and post-pruning are two standard techniques for handling noise in decision tree learning. Pre-pruning deals with noise during learning, while post-pruning addresses this problem after an overfitting theory has been learned. We first review several adaptations of pre- and post-pruning techniques for separate-and-conquer rule learning algorithms and discuss some fundamental problems. The primary goal of this paper is to show how to solve these problems with two new algorithms that combine and integrate pre- and post-pruning.
Simplifying Decision Trees: A Survey
- 1996
Cited by 32 (5 self)
Induced decision trees are an extensively-researched solution to classification tasks. For many practical tasks, the trees produced by tree-generation algorithms are not comprehensible to users due to their size and complexity. Although many tree induction algorithms have been shown to produce simpler, more comprehensible trees (or data structures derived from trees) with good classification accuracy, tree simplification has usually been of secondary concern relative to accuracy and no attempt has been made to survey the literature from the perspective of simplification. We present a framework that organizes the approaches to tree simplification and summarize and critique the approaches within this framework. The purpose of this survey is to provide researchers and practitioners with a concise overview of tree-simplification approaches and insight into their relative capabilities. In our final discussion, we briefly describe some empirical findings and discuss the application of tree i...
Overcoming the myopia of inductive learning algorithms with RELIEFF
- Applied Intelligence, 1997
Cited by 30 (11 self)
Current inductive machine learning algorithms typically use greedy search with limited lookahead. This prevents them from detecting significant conditional dependencies between the attributes that describe training objects. Instead of myopic impurity functions and lookahead, we propose to use RELIEFF, an extension of RELIEF developed by Kira and Rendell [10], [11], for heuristic guidance of inductive learning algorithms. We have reimplemented Assistant, a system for top down induction of decision trees, using RELIEFF as an estimator of attributes at each selection step. The algorithm is tested on several artificial and several real-world problems and the results are compared with some other well-known machine learning algorithms. Excellent results on artificial data sets and two real-world problems show the advantage of the presented approach to inductive learning.
Keywords: learning from examples, estimating attributes, impurity function, RELIEFF, empirical evaluation
1. Introduction ...
Analysing and Improving the Diagnosis of Ischaemic Heart Disease with Machine Learning
- 1999
Cited by 10 (6 self)
Ischaemic heart disease is one of the world's most important causes of mortality, so improvements and rationalization of diagnostic procedures would be very useful. The four diagnostic levels consist of evaluation of signs and symptoms of the disease and ECG (electrocardiogram) at rest, sequential ECG testing during controlled exercise, myocardial scintigraphy, and finally coronary angiography (which is considered to be the reference method).
Induction of decision trees using RELIEFF
- 1995
Cited by 10 (2 self)
In the context of machine learning from examples this paper deals with the problem of estimating the quality of attributes with and without dependencies between them. Greedy search prevents current inductive machine learning algorithms from detecting significant dependencies between the attributes. Recently, Kira and Rendell developed the RELIEF algorithm for estimating the quality of attributes that is able to detect dependencies between attributes. We show a strong relation between RELIEF's estimates and impurity functions, which are usually used for heuristic guidance of inductive learning algorithms. We propose to use RELIEFF, an extended version of RELIEF, instead of myopic impurity functions. We have reimplemented Assistant, a system for top down induction of decision trees, using RELIEFF as an estimator of attributes at each selection step. The algorithm is tested on several artificial and several real world problems. Results show the advantage of the presented approach to inductive lea...
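To make the idea of dependency-aware attribute estimation concrete, here is a minimal sketch of the basic two-class RELIEF weight update for discrete attributes: each attribute gains weight when it differs on the nearest instance of the other class (the "miss") and loses weight when it differs on the nearest instance of the same class (the "hit"). This is an illustration under stated assumptions, not the implementation from the paper. It visits every instance once instead of sampling at random, and it omits the RELIEFF extensions. The names `diff`, `distance`, and `relief_weights` and the XOR toy problem are hypothetical.

```python
from itertools import product

def diff(a, b):
    """Difference between two discrete attribute values: 0 if equal, 1 otherwise."""
    return 0.0 if a == b else 1.0

def distance(x, y):
    """Sum of per-attribute differences (Hamming distance for discrete data)."""
    return sum(diff(a, b) for a, b in zip(x, y))

def relief_weights(instances, labels):
    """Basic two-class RELIEF estimate of attribute quality.

    Assumption: every instance is visited once (deterministic) rather than
    drawing a random sample, so the result is reproducible.
    """
    n = len(instances)
    weights = [0.0] * len(instances[0])
    for i, (r, label) in enumerate(zip(instances, labels)):
        hits = [x for j, x in enumerate(instances) if j != i and labels[j] == label]
        misses = [x for j, x in enumerate(instances) if labels[j] != label]
        hit = min(hits, key=lambda x: distance(r, x))     # nearest same-class instance
        miss = min(misses, key=lambda x: distance(r, x))  # nearest other-class instance
        for a in range(len(weights)):
            weights[a] += (diff(r[a], miss[a]) - diff(r[a], hit[a])) / n
    return weights

# Toy problem: class = attribute 0 XOR attribute 1; attribute 2 is irrelevant.
instances = list(product([0, 1], repeat=3))
labels = [x[0] ^ x[1] for x in instances]
weights = relief_weights(instances, labels)
print(weights)
```

In this XOR problem, neither relevant attribute is informative on its own, so a one-attribute-at-a-time impurity measure scores all three attributes alike. The RELIEF estimate still ranks the irrelevant third attribute below the other two.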

