Induction of Decision Trees
MACH. LEARN, 1986
Cited by 4303 (4 self)
The technology for building knowledge-based systems by inductive inference from examples has been demonstrated successfully in several practical applications. This paper summarizes an approach to synthesizing decision trees that has been used in a variety of systems, and it describes one such system, ID3, in detail. Results from recent studies show ways in which the methodology can be modified to deal with information that is noisy and/or incomplete. A reported shortcoming of the basic algorithm is discussed and two means of overcoming it are compared. The paper concludes with illustrations of current research directions.
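The abstract above does not spell out how ID3 chooses the attribute to split on. As a minimal sketch, assuming the commonly described information-gain criterion, the selection step might look like the following. The function names and the toy data layout (a list of dicts plus a parallel list of labels) are illustrative assumptions, not taken from the paper.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def information_gain(rows, labels, attribute):
    """Reduction in label entropy obtained by partitioning on one attribute."""
    total = len(labels)
    partitions = {}
    for row, label in zip(rows, labels):
        partitions.setdefault(row[attribute], []).append(label)
    # Expected entropy remaining after the split, weighted by partition size.
    remainder = sum(len(part) / total * entropy(part) for part in partitions.values())
    return entropy(labels) - remainder


def best_attribute(rows, labels, attributes):
    """Attribute with the highest information gain (hypothetical helper)."""
    return max(attributes, key=lambda a: information_gain(rows, labels, a))


# Toy data: "x" determines the class, "y" carries no information.
rows = [{"x": 0, "y": 0}, {"x": 0, "y": 1}, {"x": 1, "y": 0}, {"x": 1, "y": 1}]
labels = ["n", "n", "p", "p"]
chosen = best_attribute(rows, labels, ["x", "y"])
```

A full tree builder would apply this choice recursively to each partition until the labels in a partition are all the same or no attributes remain.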
Learning logical definitions from relations
MACHINE LEARNING, 1990
Cited by 930 (8 self)
This paper describes FOIL, a system that learns Horn clauses from data expressed as relations. FOIL is based on ideas that have proved effective in attribute-value learning systems, but extends them to a first-order formalism. This new system has been applied successfully to several tasks taken from the machine learning literature.
The CN2 Induction Algorithm
MACHINE LEARNING, 1989
Cited by 884 (6 self)
Systems for inducing concept descriptions from examples are valuable tools for assisting in the task of knowledge acquisition for expert systems. This paper presents a description and empirical evaluation of a new induction system, CN2, designed for the efficient induction of simple, comprehensible production rules in domains where problems of poor description language and/or noise may be present. Implementations of the CN2, ID3 and AQ algorithms are compared on three medical classification tasks.
The strength of weak learnability
Machine Learning, 1990
Cited by 861 (24 self)
This paper addresses the problem of improving the accuracy of a hypothesis output by a learning algorithm in the distribution-free (PAC) learning model. A concept class is learnable (or strongly learnable) if, given access to a source of examples of the unknown concept, the learner with high probability is able to output a hypothesis that is correct on all but an arbitrarily small fraction of the instances. The concept class is weakly learnable if the learner can produce a hypothesis that performs only slightly better than random guessing. In this paper, it is shown that these two notions of learnability are equivalent. A method is described for converting a weak learning algorithm into one that achieves arbitrarily high accuracy. This construction may have practical applications as a tool for efficiently converting a mediocre learning algorithm into one that performs extremely well. In addition, the construction has some interesting theoretical consequences, including a set of general upper bounds on the complexity of any strong learning algorithm as a function of the allowed error ε.
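The abstract does not give the construction itself. As a rough numerical illustration only, the sketch below assumes the bound usually quoted for this paper: a majority vote over three hypotheses, each with error at most α on suitably filtered distributions, has error at most 3α² − 2α³. Applying that bound recursively shows how quickly the error falls. The function names are hypothetical.

```python
def boosted_error(alpha):
    """Assumed error bound after one majority-of-three step: 3a^2 - 2a^3."""
    return 3 * alpha**2 - 2 * alpha**3


def levels_needed(alpha, target):
    """Number of recursive majority-of-three levels until the bound <= target."""
    if not 0 <= alpha < 0.5:
        raise ValueError("a weak learner must have error strictly below 1/2")
    if target <= 0:
        raise ValueError("target error must be positive")
    levels = 0
    while alpha > target:
        alpha = boosted_error(alpha)
        levels += 1
    return levels


# The bound leaves 1/2 unchanged (random guessing cannot be boosted)
# and strictly reduces any error below 1/2.
fixed_point = boosted_error(0.5)
improved = boosted_error(0.4)
depth = levels_needed(0.4, 0.01)
```

This only iterates the bound; it does not implement the example-filtering procedure that the actual construction needs to train the second and third hypotheses.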
Unsupervised word sense disambiguation rivaling supervised methods
IN PROCEEDINGS OF THE 33RD ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, 1995
Cited by 629 (4 self)
This paper presents an unsupervised learning algorithm for sense disambiguation that, when trained on unannotated English text, rivals the performance of supervised techniques that require time-consuming hand annotations. The algorithm is based on two powerful constraints, that words tend to have one sense per discourse and one sense per collocation, exploited in an iterative bootstrapping procedure. Tested accuracy exceeds 96%.
Simple Heuristics That Make Us Smart
2008
Cited by 417 (12 self)
To survive in a world where knowledge is limited, time is pressing, and deep thought is often an unattainable luxury, decision makers must use bounded rationality. In this précis of Simple Heuristics That Make Us Smart, we explore fast and frugal heuristics: simple rules for making decisions with realistic mental resources. These heuristics enable smart choices to be made quickly and with a minimum of information by exploiting the way that information is structured in particular environments. Despite limiting information search and processing, simple heuristics perform comparably to more complex algorithms, particularly when generalizing to new data; simplicity leads to robustness.
Rule Induction with CN2: Some Recent Improvements
1991
Cited by 381 (2 self)
The CN2 algorithm induces an ordered list of classification rules from examples using entropy as its search heuristic. In this short paper, we describe two improvements to this algorithm. Firstly, we present the use of the Laplacian error estimate as an alternative evaluation function and secondly, we show how unordered as well as ordered rules can be generated. We experimentally demonstrate significantly improved performances resulting from these changes, thus enhancing the usefulness of CN2 as an inductive tool. Comparisons with Quinlan's C4.5 are also made.
Keywords: learning, rule induction, CN2, Laplace, noise
1 Introduction
Rule induction from examples has established itself as a basic component of many machine learning systems, and has been the first ML technology to deliver commercially successful applications (e.g. the systems GASOIL [Slocombe et al., 1986], BMT [Hayes-Michie, 1990], and in process control [Leech, 1986]). The continuing development of inductive techniques is t...
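To make the contrast between the two evaluation functions named in the abstract concrete, here is a minimal sketch. It assumes the usual form of the Laplace estimate, (n_c + 1) / (n_tot + k), where n_c is the number of covered examples in the predicted class, n_tot the number of covered examples, and k the number of classes. The function names and the example counts are illustrative assumptions.

```python
from math import log2


def entropy(class_counts):
    """Entropy (in bits) of the class distribution among covered examples."""
    total = sum(class_counts)
    return -sum((c / total) * log2(c / total) for c in class_counts if c > 0)


def laplace_accuracy(class_counts, predicted):
    """Laplace-corrected accuracy of a rule predicting class index `predicted`."""
    total = sum(class_counts)
    k = len(class_counts)  # number of classes
    return (class_counts[predicted] + 1) / (total + k)


# Counts of covered examples per class for two hypothetical rules.
specific_rule = [1, 0]   # covers one example, no errors
general_rule = [99, 1]   # covers 100 examples, one error

# Entropy (lower is better) prefers the very specific rule...
entropy_prefers_specific = entropy(specific_rule) < entropy(general_rule)
# ...while the Laplace estimate (higher is better) prefers the general one.
laplace_prefers_general = (
    laplace_accuracy(general_rule, 0) > laplace_accuracy(specific_rule, 0)
)
```

The example shows why a coverage-sensitive estimate can help in noisy domains: a rule that is perfect on a single example scores no better than 2/3 with two classes, so it no longer beats a well-supported rule with a small error rate.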
Efficient noise-tolerant learning from statistical queries
JOURNAL OF THE ACM, 1998
Cited by 357 (5 self)
In this paper, we study the problem of learning in the presence of classification noise in the probabilistic learning model of Valiant and its variants. In order to identify the class of “robust” learning algorithms in the most general way, we formalize a new but related model of learning from statistical queries. Intuitively, in this model, a learning algorithm is forbidden to examine individual examples of the unknown target function, but is given access to an oracle providing estimates of probabilities over the sample space of random examples. One of our main results shows that any class of functions learnable from statistical queries is in fact learnable with classification noise in Valiant’s model, with a noise rate approaching the information-theoretic barrier of 1/2. We then demonstrate the generality of the statistical query model, showing that practically every class learnable in Valiant’s model and its variants can also be learned in the new model (and thus can be learned in the presence of noise). A notable exception to this statement is the class of parity functions, which we prove is not learnable from statistical queries, and for which no noise-tolerant algorithm is known.
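One simple special case helps show why probability estimates survive classification noise. If each label is flipped independently with a known rate η < 1/2, a hypothesis with true error p disagrees with the noisy labels at rate η + (1 − 2η)p, so p can be recovered from the observed rate. The sketch below simulates this; it is not the paper's general simulation of statistical queries, and the target, hypothesis, and function names are assumptions made for illustration.

```python
import random


def noisy_disagreement(hypothesis, target, eta, n, rng):
    """Fraction of n random examples where the hypothesis disagrees with a
    label that has been flipped with probability eta."""
    disagreements = 0
    for _ in range(n):
        x = rng.random()
        label = target(x)
        if rng.random() < eta:
            label = not label  # classification noise
        disagreements += hypothesis(x) != label
    return disagreements / n


def corrected_error(observed, eta):
    """Invert observed = eta + (1 - 2*eta) * p to recover the true error p."""
    if not 0 <= eta < 0.5:
        raise ValueError("noise rate must lie in [0, 1/2)")
    return (observed - eta) / (1 - 2 * eta)


def target(x):
    return x > 0.5


def hypothesis(x):
    return x > 0.6  # disagrees with the target on (0.5, 0.6]: true error 0.1


rng = random.Random(0)
observed = noisy_disagreement(hypothesis, target, eta=0.2, n=20000, rng=rng)
estimate = corrected_error(observed, eta=0.2)
```

As η approaches 1/2 the denominator 1 − 2η shrinks, so the number of samples needed for a given accuracy grows, which is consistent with the abstract's description of 1/2 as the barrier.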
Operations for Learning with Graphical Models
Journal of Artificial Intelligence Research, 1994
Cited by 277 (13 self)
This paper is a multidisciplinary review of empirical, statistical learning from a graphical model perspective. Well-known examples of graphical models include Bayesian networks, directed graphs representing a Markov chain, and undirected networks representing a Markov field. These graphical models are extended to model data analysis and empirical learning using the notation of plates. Graphical operations for simplifying and manipulating a problem are provided including decomposition, differentiation, and the manipulation of probability models from the exponential family. Two standard algorithm schemas for learning are reviewed in a graphical framework: Gibbs sampling and the expectation maximization algorithm. Using these operations and schemas, some popular algorithms can be synthesized from their graphical specification. This includes versions of linear regression, techniques for feedforward networks, and learning Gaussian and discrete Bayesian networks from data. The paper conclu...
Exploiting structure in policy construction
IJCAI-95, pp. 1104–1111, 1995
Cited by 253 (24 self)
Markov decision processes (MDPs) have recently been applied to the problem of modeling decision-theoretic planning. While traditional methods for solving MDPs are often practical for small state spaces, their effectiveness for large AI planning problems is questionable. We present an algorithm, called structured policy iteration (SPI), that constructs optimal policies without explicit enumeration of the state space. The algorithm retains the fundamental computational steps of the commonly used modified policy iteration algorithm, but exploits the variable and propositional independencies reflected in a temporal Bayesian network representation of MDPs. The principles behind SPI can be applied to any structured representation of stochastic actions, policies and value functions, and the algorithm itself can be used in conjunction with recent approximation methods.