Instance-based learning algorithms
- Machine Learning, 1991
- Cited by 897 (18 self)
Abstract. Storing and using specific instances improves the performance of several supervised learning algorithms. These include algorithms that learn decision trees, classification rules, and distributed networks. However, no investigation has analyzed algorithms that use only specific instances to solve incremental learning tasks. In this paper, we describe a framework and methodology, called instance-based learning, that generates classification predictions using only specific instances. Instance-based learning algorithms do not maintain a set of abstractions derived from specific instances. This approach extends the nearest neighbor algorithm, which has large storage requirements. We describe how storage requirements can be significantly reduced with, at most, minor sacrifices in learning rate and classification accuracy. While the storage-reducing algorithm performs well on several real-world databases, its performance degrades rapidly with the level of attribute noise in training instances. Therefore, we extended it with a significance test to distinguish noisy instances. This extended algorithm's performance degrades gracefully with increasing noise levels and compares favorably with a noise-tolerant decision tree algorithm.
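The storage-reduction idea in the abstract above can be illustrated with a small sketch. The abstract does not state the storage rule, so this assumes one plausible rule: a 1-nearest-neighbor classifier with Euclidean distance that stores a training instance only when the instances stored so far misclassify it. The function names `nearest_label` and `train_reduced` and the toy data are hypothetical and are not taken from the paper.

```python
import math

def nearest_label(stored, x):
    """Return the label of the stored instance closest to x (Euclidean distance)."""
    best = min(stored, key=lambda inst: math.dist(inst[0], x))
    return best[1]

def train_reduced(stream):
    """Process instances one at a time and keep only those that the
    instances stored so far would misclassify (assumed storage rule)."""
    stored = []
    for x, y in stream:
        if not stored or nearest_label(stored, x) != y:
            stored.append((x, y))
    return stored

# Hypothetical toy data: two well-separated clusters.
data = [((0.0, 0.0), "a"), ((0.1, 0.2), "a"), ((5.0, 5.0), "b"),
        ((0.2, 0.1), "a"), ((5.1, 4.9), "b"), ((4.8, 5.2), "b")]
stored = train_reduced(data)
print(len(stored), "of", len(data), "instances stored")
```

On this toy data only one instance per cluster is kept, because every later instance is already classified correctly by its nearest stored neighbor. The same rule also shows why noise hurts: a mislabeled instance is always misclassified and therefore always stored.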
The CN2 Induction Algorithm
- MACHINE LEARNING, 1989
- Cited by 682 (6 self)
Systems for inducing concept descriptions from examples are valuable tools for assisting in the task of knowledge acquisition for expert systems. This paper presents a description and empirical evaluation of a new induction system, CN2, designed for the efficient induction of simple, comprehensible production rules in domains where problems of poor description language and/or noise may be present. Implementations of the CN2, ID3 and AQ algorithms are compared on three medical classification tasks.
An analysis of Bayesian classifiers
- IN PROCEEDINGS OF THE TENTH NATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE, 1992
- Cited by 285 (16 self)
In this paper we present an average-case analysis of the Bayesian classifier, a simple induction algorithm that fares remarkably well on many learning tasks. Our analysis assumes a monotone conjunctive target concept, and independent, noise-free Boolean attributes. We calculate the probability that the algorithm will induce an arbitrary pair of concept descriptions and then use this to compute the probability of correct classification over the instance space. The analysis takes into account the number of training instances, the number of attributes, the distribution of these attributes, and the level of class noise. We also explore the behavioral implications of the analysis by presenting
Induction of Selective Bayesian Classifiers
- CONFERENCE ON UNCERTAINTY IN ARTIFICIAL INTELLIGENCE, 1994
- Cited by 179 (7 self)
In this paper, we examine previous work on the naive Bayesian classifier and review its limitations, which include a sensitivity to correlated features. We respond to this problem by embedding the naive Bayesian induction scheme within an algorithm that carries out a greedy search through the space of features. We hypothesize that this approach will improve asymptotic accuracy in domains that involve correlated features without reducing the rate of learning in ones that do not. We report experimental results on six natural domains, including comparisons with decision-tree induction, that support these hypotheses. In closing, we discuss other approaches to extending naive Bayesian classifiers and outline some directions for future research.
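The idea of wrapping naive Bayesian induction in a greedy feature search, as described in the abstract above, can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's procedure: it assumes forward selection, Boolean features, Laplace-smoothed estimates, and training-set accuracy as the evaluation measure. All function names and the toy data are hypothetical.

```python
import math
from collections import Counter, defaultdict

def fit_nb(X, y, feats):
    """Count class frequencies and (feature, value, class) co-occurrences."""
    classes = Counter(y)
    counts = defaultdict(int)
    for row, c in zip(X, y):
        for f in feats:
            counts[(f, row[f], c)] += 1
    return classes, counts

def predict_nb(model, row, feats, n):
    """Pick the class maximizing log prior plus summed log likelihoods."""
    classes, counts = model
    def score(c):
        s = math.log(classes[c] / n)
        for f in feats:
            # Laplace smoothing, assuming two possible values per feature
            s += math.log((counts[(f, row[f], c)] + 1) / (classes[c] + 2))
        return s
    return max(sorted(classes), key=score)

def accuracy(X, y, feats):
    """Training-set accuracy of naive Bayes restricted to feats (assumed measure)."""
    model = fit_nb(X, y, feats)
    hits = sum(predict_nb(model, r, feats, len(y)) == c for r, c in zip(X, y))
    return hits / len(y)

def greedy_select(X, y):
    """Forward selection: repeatedly add the single feature that most
    improves accuracy, and stop when no addition improves it."""
    selected, best = [], accuracy(X, y, [])
    remaining = list(range(len(X[0])))
    while remaining:
        scored = [(accuracy(X, y, selected + [f]), f) for f in remaining]
        acc, f = max(scored, key=lambda t: t[0])  # first best wins on ties
        if acc <= best:
            break
        selected.append(f)
        remaining.remove(f)
        best = acc
    return selected, best

# Hypothetical toy data: feature 0 determines the class, feature 1 is noise.
X = [(1, 0), (1, 1), (1, 0), (0, 1), (0, 0), (0, 1)]
y = [1, 1, 1, 0, 0, 0]
selected, best = greedy_select(X, y)
print(selected, best)
```

Here the search keeps feature 0 and stops, because adding feature 1 does not raise accuracy any further. A held-out set or cross-validation could replace training-set accuracy as the evaluation measure without changing the search loop.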
Concept Learning and the Problem of Small Disjuncts
- 1995
- Cited by 136 (1 self)
Ideally, definitions induced from examples should consist of all, and only, disjuncts that are meaningful (e.g., as measured by a statistical significance test) and have a low error rate. Existing inductive systems create definitions that are ideal with regard to large disjuncts, but far from ideal with regard to small disjuncts, where a small (large) disjunct is one that correctly classifies few (many) training examples. The problem with small disjuncts is that many of them have high rates of misclassification, and it is difficult to eliminate the error-prone small disjuncts from a definition without adversely affecting other disjuncts in the definition. Various approaches to this problem are evaluated, including the novel approach of using a bias different than the "maximum generality" bias. This approach, and some others, prove partly successful, but the problem of small disjuncts remains open.
Learning classification trees
- Statistics and Computing, 1992
- Cited by 112 (8 self)
Algorithms for learning classification trees have had successes in artificial intelligence and statistics over many years. This paper outlines how a tree learning algorithm can be derived using Bayesian statistics. This introduces Bayesian techniques for splitting, smoothing, and tree averaging. The splitting rule is similar to Quinlan's information gain, while smoothing and averaging replace pruning. Comparative experiments with reimplementations of a minimum encoding approach, Quinlan's C4 (1987) and Breiman et al.'s CART (1984) show the full Bayesian algorithm produces more accurate predictions than versions
Incremental Reduced Error Pruning
- 1994
- Cited by 101 (22 self)
This paper outlines some problems that may occur with Reduced Error Pruning in Inductive Logic Programming, most notably efficiency. Thereafter a new method, Incremental Reduced Error Pruning, is proposed that attempts to address all of these problems. Experiments show that in many noisy domains this method is much more efficient than alternative algorithms, along with a slight gain in accuracy. However, the experiments show as well that the use of this algorithm cannot be recommended for domains with a very specific concept description.
OEFAI-TR-94-09
1 Introduction
Being able to deal with noisy data is a must for algorithms that are meant to learn concepts in real-world domains. Significant effort has gone into investigating the effect of noisy data on decision tree learning algorithms (see e.g. [Quinlan, 1993, Breiman et al., 1984]). Not surprisingly, noise handling methods have also entered the emerging field of Inductive Logic Programming (ILP) [Muggleton, 1992]. Linus [Lavr...
A Theory of Learning Classification Rules
- 1992
- Cited by 77 (6 self)
The main contributions of this thesis are a Bayesian theory of learning classification rules, the unification and comparison of this theory with some previous theories of learning, and two extensive applications of the theory to the problems of learning class probability trees and bounding error when learning logical rules. The thesis is motivated by considering some current research issues in machine learning such as bias, overfitting and search, and considering the requirements placed on a learning system when it is used for knowledge acquisition. Basic Bayesian decision theory relevant to the problem of learning classification rules is reviewed, then a Bayesian framework for such learning is presented. The framework has three components: the hypothesis space, the learning protocol, and criteria for successful learning. Several learning protocols are analysed in detail: queries, logical, noisy, uncertain and positive-only examples. The analysis is done by interpreting a protocol as a...
A Comparative Analysis of Methods for Pruning Decision Trees
- IEEE Transactions on Pattern Analysis and Machine Intelligence, 1997
- Cited by 73 (2 self)
In this paper, we address the problem of retrospectively pruning decision trees induced from data, according to a top-down approach. This problem has received considerable attention in the areas of pattern recognition and machine learning, and many distinct methods have been proposed in the literature. We make a comparative study of six well-known pruning methods with the aim of understanding their theoretical foundations, their computational complexity, and the strengths and weaknesses of their formulation. Comments on the characteristics of each method are empirically supported. In particular, a wide experimentation performed on several data sets leads us to opposite conclusions on the predictive accuracy of simplified trees from some drawn in the literature. We attribute this divergence to differences in experimental designs. Finally, we prove and make use of a property of the reduced error pruning method to obtain an objective evaluation of the tendency to overprune/underprune ob...
An Experimental Comparison of Human and Machine Learning Formalisms
- In Proceedings of the Sixth International Workshop on Machine Learning, 1989
- Cited by 50 (9 self)
In this paper we describe the results of a set of experiments in which we compared the learning performance of human and machine learning agents. The problem involved the learning of a concept description for deciding on the legality of positions within the chess endgame King and Rook against King. Various amounts of background knowledge were made available to each learning agent. We concluded that the ability to produce high performance in this domain was almost entirely dependent on the ability to express first-order predicate relationships.
1 Introduction
It is a commonly held belief that the use of a restricted hypothesis language simplifies the task of learning. In this paper we investigate a simple problem in which this is not the case. We describe a set of experiments in which a number of different inductive learning agents, with various hypothesis languages, were provided with the same training and test material. In all the experiments described the training and test instances...

