Results 1  10
of
715
Learning Stochastic Logic Programs
, 2000
"... Stochastic Logic Programs (SLPs) have been shown to be a generalisation of Hidden Markov Models (HMMs), stochastic contextfree grammars, and directed Bayes' nets. A stochastic logic program consists of a set of labelled clauses p:C where p is in the interval [0,1] and C is a firstorder r ..."
Abstract

Cited by 1193 (81 self)
 Add to MetaCart
Stochastic Logic Programs (SLPs) have been shown to be a generalisation of Hidden Markov Models (HMMs), stochastic contextfree grammars, and directed Bayes' nets. A stochastic logic program consists of a set of labelled clauses p:C where p is in the interval [0,1] and C is a firstorder rangerestricted definite clause. This paper summarises the syntax, distributional semantics and proof techniques for SLPs and then discusses how a standard Inductive Logic Programming (ILP) system, Progol, has been modied to support learning of SLPs. The resulting system 1) nds an SLP with uniform probability labels on each definition and nearmaximal Bayes posterior probability and then 2) alters the probability labels to further increase the posterior probability. Stage 1) is implemented within CProgol4.5, which differs from previous versions of Progol by allowing userdefined evaluation functions written in Prolog. It is shown that maximising the Bayesian posterior function involves nding SLPs with short derivations of the examples. Search pruning with the Bayesian evaluation function is carried out in the same way as in previous versions of CProgol. The system is demonstrated with worked examples involving the learning of probability distributions over sequences as well as the learning of simple forms of uncertain knowledge.
BottomUp Relational Learning of Pattern Matching Rules for Information Extraction
, 2003
"... Information extraction is a form of shallow text processing that locates a specified set of relevant items in a naturallanguage document. Systems for this task require significant domainspecific knowledge and are timeconsuming and difficult to build by hand, making them a good application for ..."
Abstract

Cited by 406 (20 self)
 Add to MetaCart
(Show Context)
Information extraction is a form of shallow text processing that locates a specified set of relevant items in a naturallanguage document. Systems for this task require significant domainspecific knowledge and are timeconsuming and difficult to build by hand, making them a good application for machine learning. We present an algorithm, RAPIER, that uses pairs of sample documents and filled templates to induce patternmatch rules that directly extract fillers for the slots in the template. RAPIER is a bottomup learning algorithm that incorporates techniques from several inductive logic programming systems. We have implemented the algorithm in a system that allows patterns to have constraints on the words, partofspeech tags, and semantic classes present in the filler and the surrounding text. We present encouraging experimental results on two domains.
Learning Trees and Rules with Setvalued Features
, 1996
"... In most learning systems examples are represented as fixedlength "feature vectors", the components of which are either real numbers or nominal values. We propose an extension of the featurevector representation that allows the value of a feature to be a set of strings; for instance, to re ..."
Abstract

Cited by 210 (2 self)
 Add to MetaCart
(Show Context)
In most learning systems examples are represented as fixedlength "feature vectors", the components of which are either real numbers or nominal values. We propose an extension of the featurevector representation that allows the value of a feature to be a set of strings; for instance, to represent a small white and black dog with the nominal features size and species and the setvalued feature color, one might use a feature vector with size=small, species=canisfamiliaris and color=fwhite,blackg. Since we make no assumptions about the number of possible set elements, this extension of the traditional featurevector representation is closely connected to Blum's "infinite attribute" representation. We argue that many decision tree and rule learning algorithms can be easily extended to setvalued features. We also show by example that many realworld learning problems can be efficiently and naturally represented with setvalued features; in particular, text categorization problems and probl...
Clausal Discovery
, 1997
"... The clausal discovery engine Claudien is presented. Claudien is an inductive logic programming engine that fits in the descriptive data mining paradigm. Claudien addresses characteristic induction from interpretations, a task which is related to existing formalisations of induction in logic. In ch ..."
Abstract

Cited by 199 (34 self)
 Add to MetaCart
The clausal discovery engine Claudien is presented. Claudien is an inductive logic programming engine that fits in the descriptive data mining paradigm. Claudien addresses characteristic induction from interpretations, a task which is related to existing formalisations of induction in logic. In characteristic induction from interpretations, the regularities are represented by clausal theories, and the data using Herbrand interpretations. Because Claudien uses clausal logic to represent hypotheses, the regularities induced typically involve multiple relations or predicates. Claudien also employs a novel declarative bias mechanism to define the set of clauses that may appear in a hypothesis.
An Algorithm for MultiRelational Discovery of Subgroups
, 1997
"... We consider the problem of finding statistically unusual subgroups in a multirelation database, and extend previous work on singlerelation subgroup discovery. We give a precise definition of the multirelation subgroup discovery task, propose a specific form of declarative bias based on foreign ..."
Abstract

Cited by 195 (8 self)
 Add to MetaCart
We consider the problem of finding statistically unusual subgroups in a multirelation database, and extend previous work on singlerelation subgroup discovery. We give a precise definition of the multirelation subgroup discovery task, propose a specific form of declarative bias based on foreign links as a means of specifying the hypothesis space, and show how propositional evaluation functions can be adapted to the multirelation setting. We then describe an algorithm for this problem setting that uses optimistic estimate and minimal support pruning, an optimal refinement operator and sampling to ensure efficiency and can easily be parallelized.
Separateandconquer rule learning
 Artificial Intelligence Review
, 1999
"... This paper is a survey of inductive rule learning algorithms that use a separateandconquer strategy. This strategy can be traced back to the AQ learning system and still enjoys popularity as can be seen from its frequent use in inductive logic programming systems. We will put this wide variety of ..."
Abstract

Cited by 168 (29 self)
 Add to MetaCart
This paper is a survey of inductive rule learning algorithms that use a separateandconquer strategy. This strategy can be traced back to the AQ learning system and still enjoys popularity as can be seen from its frequent use in inductive logic programming systems. We will put this wide variety of algorithms into a single framework and analyze them along three different dimensions, namely their search, language and overfitting avoidance biases.
Theories for Mutagenicity: A Study in FirstOrder and FeatureBased Induction
 Artificial Intelligence
, 1996
"... A classic problem from chemistry is used to test a conjecture that in domains for which data are most naturally represented by graphs, theories constructed with Inductive Logic Programming (ILP) will significantly outperform those using simpler featurebased methods. One area that has long been asso ..."
Abstract

Cited by 159 (30 self)
 Add to MetaCart
A classic problem from chemistry is used to test a conjecture that in domains for which data are most naturally represented by graphs, theories constructed with Inductive Logic Programming (ILP) will significantly outperform those using simpler featurebased methods. One area that has long been associated with graphbased or structural representation and reasoning is organic chemistry. In this field, we consider the problem of predicting the mutagenic activity of small molecules: a property that is related to carcinogenicity, and an important consideration in developing less hazardous drugs. By providing an ILP system with progressively more structural information concerning the molecules, we compare the predictive power of the logical theories constructed against benchmarks set by regression, neural, and treebased methods. 1 Introduction Constructing theories to explain observations occupies much of the creative hours of scientists and engineers. Programs from the field of Inductiv...
Frequent SubStructureBased Approaches for Classifying Chemical Compounds
 In Proceedings of ICDM’03
, 2003
"... In this paper we study the problem of classifying chemical compound datasets. We present a substructurebased classification algorithm that decouples the substructure discovery process from the classification model construction and uses frequent subgraph discovery algorithms to find all topologi ..."
Abstract

Cited by 140 (6 self)
 Add to MetaCart
In this paper we study the problem of classifying chemical compound datasets. We present a substructurebased classification algorithm that decouples the substructure discovery process from the classification model construction and uses frequent subgraph discovery algorithms to find all topological and geometric substructures present in the dataset. The advantage of our approach is that during classification model construction, all relevant substructures are available allowing the classifier to intelligently select the most discriminating ones. The computational scalability is ensured by the use of highly efficient frequent subgraph discovery algorithms coupled with aggressive feature selection. Our experimental evaluation on eight different classification problems shows that our approach is computationally scalable and outperforms existing schemes by 10% to 35%, on the average.
Lifted firstorder probabilistic inference
 In Proceedings of IJCAI05, 19th International Joint Conference on Artificial Intelligence
, 2005
"... Most probabilistic inference algorithms are specified and processed on a propositional level. In the last decade, many proposals for algorithms accepting firstorder specifications have been presented, but in the inference stage they still operate on a mostly propositional representation level. [Poo ..."
Abstract

Cited by 126 (8 self)
 Add to MetaCart
Most probabilistic inference algorithms are specified and processed on a propositional level. In the last decade, many proposals for algorithms accepting firstorder specifications have been presented, but in the inference stage they still operate on a mostly propositional representation level. [Poole, 2003] presented a method to perform inference directly on the firstorder level, but this method is limited to special cases. In this paper we present the first exact inference algorithm that operates directly on a firstorder level, and that can be applied to any firstorder model (specified in a language that generalizes undirected graphical models). Our experiments show superior performance in comparison with propositional exact inference. 1
An efficient algorithm for discovering frequent subgraphs
 IEEE Transactions on Knowledge and Data Engineering
, 2002
"... Abstract — Over the years, frequent itemset discovery algorithms have been used to find interesting patterns in various application areas. However, as data mining techniques are being increasingly applied to nontraditional domains, existing frequent pattern discovery approach cannot be used. This i ..."
Abstract

Cited by 120 (7 self)
 Add to MetaCart
(Show Context)
Abstract — Over the years, frequent itemset discovery algorithms have been used to find interesting patterns in various application areas. However, as data mining techniques are being increasingly applied to nontraditional domains, existing frequent pattern discovery approach cannot be used. This is because the transaction framework that is assumed by these algorithms cannot be used to effectively model the datasets in these domains. An alternate way of modeling the objects in these datasets is to represent them using graphs. Within that model, one way of formulating the frequent pattern discovery problem is as that of discovering subgraphs that occur frequently over the entire set of graphs. In this paper we present a computationally efficient algorithm, called FSG, for finding all frequent subgraphs in large graph datasets. We experimentally evaluate the performance of FSG using a variety of real and synthetic datasets. Our results show that despite the underlying complexity associated with frequent subgraph discovery, FSG is effective in finding all frequently occurring subgraphs in datasets containing over 200,000 graph transactions and scales linearly with respect to the size of the dataset. Index Terms — Data mining, scientific datasets, frequent pattern discovery, chemical compound datasets.