Results 1–10 of 99
Learning Stochastic Logic Programs
, 2000
"... Stochastic Logic Programs (SLPs) have been shown to be a generalisation of Hidden Markov Models (HMMs), stochastic contextfree grammars, and directed Bayes' nets. A stochastic logic program consists of a set of labelled clauses p:C where p is in the interval [0,1] and C is a firstorder r ..."
Abstract

Cited by 1181 (79 self)
Stochastic Logic Programs (SLPs) have been shown to be a generalisation of Hidden Markov Models (HMMs), stochastic context-free grammars, and directed Bayes' nets. A stochastic logic program consists of a set of labelled clauses p:C, where p is in the interval [0,1] and C is a first-order range-restricted definite clause. This paper summarises the syntax, distributional semantics and proof techniques for SLPs and then discusses how a standard Inductive Logic Programming (ILP) system, Progol, has been modified to support learning of SLPs. The resulting system 1) finds an SLP with uniform probability labels on each definition and near-maximal Bayes posterior probability and then 2) alters the probability labels to further increase the posterior probability. Stage 1) is implemented within CProgol4.5, which differs from previous versions of Progol by allowing user-defined evaluation functions written in Prolog. It is shown that maximising the Bayesian posterior function involves finding SLPs with short derivations of the examples. Search pruning with the Bayesian evaluation function is carried out in the same way as in previous versions of CProgol. The system is demonstrated with worked examples involving the learning of probability distributions over sequences as well as the learning of simple forms of uncertain knowledge.
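The abstract's point that SLPs generalise stochastic context-free grammars can be illustrated with a minimal sketch: a toy program whose labelled clauses are sampled by their probability labels. The predicate, clauses, and labels here are illustrative assumptions, not taken from the paper.

```python
import random

# Toy stochastic logic program: predicate "s" has labelled clauses whose
# probability labels sum to 1 (an illustrative example, not from the paper).
slp = {
    "s": [(0.4, ("a", "s")), (0.6, ())],   # 0.4 : s :- a, s.    0.6 : s.
}

def sample(pred="s"):
    """Sample one derivation by choosing clauses according to their labels,
    returning the terminal symbols produced along the way."""
    probs = [p for p, _ in slp[pred]]
    bodies = [b for _, b in slp[pred]]
    body = random.choices(bodies, weights=probs)[0]
    out = []
    for atom in body:
        if atom in slp:
            out.extend(sample(atom))   # expand a defined predicate
        else:
            out.append(atom)           # emit a terminal symbol
    return out
```

Under this toy program every sampled derivation yields a (possibly empty) run of "a"s, with length geometrically distributed, mirroring how an SLP induces a distribution over derivations.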
Learning Trees and Rules with Set-valued Features
, 1996
"... In most learning systems examples are represented as fixedlength "feature vectors", the components of which are either real numbers or nominal values. We propose an extension of the featurevector representation that allows the value of a feature to be a set of strings; for instance, to re ..."
Abstract

Cited by 209 (2 self)
In most learning systems examples are represented as fixed-length "feature vectors", the components of which are either real numbers or nominal values. We propose an extension of the feature-vector representation that allows the value of a feature to be a set of strings; for instance, to represent a small white and black dog with the nominal features size and species and the set-valued feature color, one might use a feature vector with size=small, species=canis familiaris and color={white, black}. Since we make no assumptions about the number of possible set elements, this extension of the traditional feature-vector representation is closely connected to Blum's "infinite attribute" representation. We argue that many decision tree and rule learning algorithms can be easily extended to set-valued features. We also show by example that many real-world learning problems can be efficiently and naturally represented with set-valued features; in particular, text categorization problems and probl...
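The abstract's dog example can be sketched directly: nominal features test equality while set-valued features test membership, which is the small change a rule learner needs. The instance and rule below are illustrative assumptions.

```python
# Hypothetical instance with nominal and set-valued features, following
# the abstract's example.
dog = {"size": "small",
       "species": "canis familiaris",
       "color": {"white", "black"}}

def matches(example, conditions):
    """Evaluate a rule body: nominal features test equality;
    set-valued features test membership of the tested value."""
    for feat, val in conditions:
        fv = example[feat]
        ok = val in fv if isinstance(fv, (set, frozenset)) else fv == val
        if not ok:
            return False
    return True

print(matches(dog, [("size", "small"), ("color", "white")]))  # True
```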
Computing Least Common Subsumers in Description Logics
 PROCEEDINGS OF THE 10TH NATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE
, 1992
"... Description logics are a popular formalism for knowledge representation and reasoning. This paper introduces a new operation for description logics: computing the "least common subsumer" of a pair of descriptions. This operation computes the largest set of commonalities between two descrip ..."
Abstract

Cited by 107 (14 self)
Description logics are a popular formalism for knowledge representation and reasoning. This paper introduces a new operation for description logics: computing the "least common subsumer" of a pair of descriptions. This operation computes the largest set of commonalities between two descriptions. After arguing for the usefulness of this operation, we analyze it by relating computation of the least common subsumer to the well-understood problem of testing subsumption; a close connection is shown in the restricted case of "structural subsumption". We also present a method for computing the least common subsumer of "attribute chain equalities", and analyze the tractability of computing the least common subsumer of a set of descriptions, an important operation in inductive learning.
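In the simplest case, where a description is just a conjunction of atomic properties, the least common subsumer reduces to set intersection: the largest conjunction entailed by both inputs. This propositional sketch is a simplifying assumption; real description logics require structural subsumption over nested role restrictions.

```python
# LCS of two conjunctive descriptions, modelled here (a simplification)
# as frozensets of atomic properties: the most specific description
# subsuming both is their intersection.
def lcs(d1: frozenset, d2: frozenset) -> frozenset:
    return d1 & d2

tiger = frozenset({"animal", "striped", "carnivore"})
zebra = frozenset({"animal", "striped", "herbivore"})
print(sorted(lcs(tiger, zebra)))  # ['animal', 'striped']
```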
Controlling the Complexity of Learning in Logic through Syntactic and Task-Oriented Models
 INDUCTIVE LOGIC PROGRAMMING
, 1992
"... Due to the inadequacy of attributeonly representations for many learning problems, there is now a renewed interest in algorithms employing firstorder logic or restricted variants thereof as their knowledge representation. In this paper, we give a brief overview of the dimensions along which the ..."
Abstract

Cited by 100 (7 self)
Due to the inadequacy of attribute-only representations for many learning problems, there is now a renewed interest in algorithms employing first-order logic or restricted variants thereof as their knowledge representation. In this paper, we give a brief overview of the dimensions along which the complexity of learning in such representations can be controlled. We then present RDT, a model-based learning algorithm for function-free Horn clauses with negation that introduces two new means of complexity control, namely the use of syntactic rule models, and the use of a task-oriented domain topology. We briefly describe some preliminary application results of RDT within the knowledge acquisition system MOBAL, and present directions of further research.
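The idea of a syntactic rule model can be sketched as a second-order template whose predicate variables are instantiated only over predicates grouped by a domain topology, shrinking the hypothesis space. The template, predicate names, and topology below are hypothetical, not RDT's actual syntax.

```python
from itertools import product

# Hypothetical domain topology: predicates grouped by task-relevant area,
# so a schema is only instantiated within one group.
topology = {"geometry": ["square", "rectangle"], "color": ["red", "blue"]}

# Second-order rule schema: P and Q are predicate variables.
schema = "P(X) :- Q(X)"

def instantiate(schema, preds):
    """Yield first-order rules by substituting predicates for P and Q."""
    for p, q in product(preds, preds):
        if p != q:
            yield schema.replace("P", p).replace("Q", q)

hyps = list(instantiate(schema, topology["geometry"]))
# -> ['square(X) :- rectangle(X)', 'rectangle(X) :- square(X)']
```

Restricting instantiation to one topology group yields 2 candidate rules here instead of the 12 that all ordered predicate pairs across both groups would produce.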
Multiple Viewpoint Systems for Music Prediction
 JOURNAL OF NEW MUSIC RESEARCH
, 1995
"... This paper examines the prediction and generation of music using a multiple viewpoint system, a collection of independent views of the musical surface each of which models a specific type of musical phenomena. Both the general style and a particular piece are modeled using dual shortterm and longt ..."
Abstract

Cited by 100 (11 self)
This paper examines the prediction and generation of music using a multiple viewpoint system, a collection of independent views of the musical surface, each of which models a specific type of musical phenomenon. Both the general style and a particular piece are modeled using dual short-term and long-term theories, and the model is created using machine learning techniques on a corpus of musical examples. The models are
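The core mechanism of combining independent viewpoints can be sketched as merging their predictive distributions, here with a weighted geometric mean followed by renormalisation. The combination rule, weights, and event distributions are illustrative assumptions rather than the paper's exact scheme.

```python
import math

def combine(dists, weights):
    """Merge predictive distributions over the same events with a
    weighted geometric mean, then renormalise to sum to 1."""
    events = dists[0].keys()
    raw = {e: math.prod(d[e] ** w for d, w in zip(dists, weights))
           for e in events}
    z = sum(raw.values())
    return {e: p / z for e, p in raw.items()}

# Hypothetical next-note distributions from a long-term (style) model
# and a short-term (current-piece) model.
long_term = {"C": 0.5, "D": 0.3, "E": 0.2}
short_term = {"C": 0.2, "D": 0.6, "E": 0.2}
pred = combine([long_term, short_term], [0.5, 0.5])
```

With equal weights the combined model favours "D", which the short-term model strongly predicts, even though the long-term model ranks "C" first.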
A Theory of Learning Classification Rules
, 1992
"... The main contributions of this thesis are a Bayesian theory of learning classification rules, the unification and comparison of this theory with some previous theories of learning, and two extensive applications of the theory to the problems of learning class probability trees and bounding error whe ..."
Abstract

Cited by 88 (6 self)
The main contributions of this thesis are a Bayesian theory of learning classification rules, the unification and comparison of this theory with some previous theories of learning, and two extensive applications of the theory to the problems of learning class probability trees and bounding error when learning logical rules. The thesis is motivated by considering some current research issues in machine learning such as bias, overfitting and search, and considering the requirements placed on a learning system when it is used for knowledge acquisition. Basic Bayesian decision theory relevant to the problem of learning classification rules is reviewed, then a Bayesian framework for such learning is presented. The framework has three components: the hypothesis space, the learning protocol, and criteria for successful learning. Several learning protocols are analysed in detail: queries, logical, noisy, uncertain and positive-only examples. The analysis is done by interpreting a protocol as a...
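The Bayesian framework the abstract describes rests on scoring hypotheses by posterior probability, P(h|D) proportional to P(h)·P(D|h). A minimal sketch under a noisy-example protocol, with a hypothetical two-element hypothesis space and noise rate:

```python
# Hypothetical hypothesis space, uniform prior, and noise model: each
# observed label disagrees with the true hypothesis with probability 0.1.
hypotheses = {"h1": lambda x: x > 0, "h2": lambda x: x > 2}
prior = {"h1": 0.5, "h2": 0.5}
noise = 0.1

data = [(1, True), (3, True), (-1, False)]

def posterior(hypotheses, prior, data):
    """Compute P(h|D) by Bayes' rule under the noisy-label likelihood."""
    post = {}
    for name, h in hypotheses.items():
        like = 1.0
        for x, y in data:
            like *= (1 - noise) if h(x) == y else noise
        post[name] = prior[name] * like
    z = sum(post.values())
    return {n: p / z for n, p in post.items()}

post = posterior(hypotheses, prior, data)
```

Here h1 explains all three examples and h2 misclassifies one, so the posterior concentrates on h1 (0.9 versus 0.1 after normalisation).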
A Polynomial Approach to the Constructive Induction of . . .
 MACHINE LEARNING
, 1994
"... The representation formalism as well as the representation language is of great importance for the success of machine learning. The representation formalism should be expressive, efficient, useful, and applicable. Firstorder logic needs to be restricted in order to be efficient for inductive and de ..."
Abstract

Cited by 71 (2 self)
The representation formalism as well as the representation language is of great importance for the success of machine learning. The representation formalism should be expressive, efficient, useful, and applicable. First-order logic needs to be restricted in order to be efficient for inductive and deductive reasoning. In the field of knowledge representation, term subsumption formalisms have been developed which are efficient and expressive. In this paper, a learning algorithm, KLUSTER, is described which represents concept definitions in this formalism. KLUSTER enhances the representation language if this is necessary for the discrimination of concepts. Hence, KLUSTER is a constructive induction program. KLUSTER builds the most specific generalization and a most general discrimination in polynomial time. It embeds these concept learning problems into the overall task of learning a hierarchy of concepts.
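The final step the abstract mentions, embedding learned concepts into a hierarchy, can be sketched by ordering concepts under subsumption, modelled here (a simplifying assumption) as inclusion of necessary properties: C subsumes D iff every property of C is also a property of D. The concepts below are hypothetical.

```python
# Hypothetical concept definitions as sets of necessary properties.
concepts = {
    "shape":   {"has_area"},
    "polygon": {"has_area", "has_sides"},
    "square":  {"has_area", "has_sides", "equal_sides"},
}

def subsumes(c, d):
    """C subsumes D if C's necessary properties are a subset of D's."""
    return concepts[c] <= concepts[d]

# All strict subsumption edges, i.e. the concept hierarchy.
hierarchy = [(c, d) for c in concepts for d in concepts
             if c != d and subsumes(c, d)]
```

The resulting edges place shape above polygon and polygon above square, the expected specialisation chain.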
Multi-Relational Data Mining: An Introduction
"... Data mining algorithms look for patterns in data. While most existing data mining approaches look for patterns in a single data table, multirelational data mining (MRDM) approaches look for patterns that involve multiple tables ..."
Abstract

Cited by 61 (0 self)
Data mining algorithms look for patterns in data. While most existing data mining approaches look for patterns in a single data table, multi-relational data mining (MRDM) approaches look for patterns that involve multiple tables
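The single-table versus multi-table distinction can be made concrete with a toy example: a pattern with an existential condition over a linked table, which a flat feature-vector learner cannot express directly. The tables and the pattern are illustrative assumptions.

```python
# Two hypothetical tables linked by a key (customers.id = orders.cust).
customers = [{"id": 1, "city": "Bonn"}, {"id": 2, "city": "Kiel"}]
orders = [{"cust": 1, "total": 150}, {"cust": 2, "total": 40}]

def covers(cust):
    """Multi-relational pattern: 'customer with SOME order over 100' --
    an existential test over the linked table."""
    return any(o["cust"] == cust["id"] and o["total"] > 100 for o in orders)

matches = [c["id"] for c in customers if covers(c)]  # [1]
```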
Selecting Among Rules Induced from a Hurricane Database
 Proceedings of AAAI-93 Workshop on Knowledge Discovery in Databases
, 1993
"... can achieve orders of magnitude reduction in the volume of data For example, we applied a commercial tool (IXLtin) to a 1,819 record tropical storm database, yielding 161 rules. However, the human comprehension goals of Knowledge Discovery in Databases may require still more orders of magnitude. We ..."
Abstract

Cited by 53 (0 self)
can achieve orders of magnitude reduction in the volume of data. For example, we applied a commercial tool (IXL) to a 1,819-record tropical storm database, yielding 161 rules. However, the human comprehension goals of Knowledge Discovery in Databases may require still more orders of magnitude. We present a rule refinement strategy, partly implemented in a Prolog program, that operationalizes "interestingness" into performance, simplicity, novelty, and significance. Applying the strategy to the induced rule base yielded 10 "genuinely interesting" rules.

I. PURPOSE OF THE STUDY

At The Travelers Insurance Company, we are involved in applying statistics and artificial intelligence techniques to the solution of business problems. This work is part of an investigation into applications for Natural Hazards Research Services. The purpose of this study is not to develop a hurricane model or predictor. It is, rather, to assess the utility of rule induction technology and our particular rule refinement strategy. The object task of the study is to develop rules that
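Operationalizing "interestingness" along the four named criteria can be sketched as a filter over scored rules. The rule encoding, thresholds, and scores below are illustrative assumptions, not the paper's actual Prolog implementation.

```python
# Hypothetical induced rules with scores for the four criteria named in
# the abstract: performance, simplicity, novelty, significance.
rules = [
    {"id": 1, "accuracy": 0.92, "n_conditions": 2, "novel": True,  "p": 0.001},
    {"id": 2, "accuracy": 0.55, "n_conditions": 6, "novel": False, "p": 0.2},
]

def interesting(rule, min_acc=0.8, max_conds=3, alpha=0.05):
    """Keep a rule only if it clears all four thresholds."""
    return (rule["accuracy"] >= min_acc          # performance
            and rule["n_conditions"] <= max_conds  # simplicity
            and rule["novel"]                    # novelty
            and rule["p"] < alpha)               # significance

kept = [r["id"] for r in rules if interesting(r)]  # [1]
```

Conjoining the criteria is one simple design choice; a graded score that ranks rules instead of filtering them would be a natural variant.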