Pharmacophore Discovery using the Inductive Logic Programming System Progol
- Machine Learning
, 1998
Cited by 48 (13 self)
This paper is a case study of a machine-aided knowledge discovery process within the general area of drug design. More specifically, the paper describes a sequence of experiments in which an Inductive Logic Programming (ILP) system is used for pharmacophore discovery. Within drug design, a pharmacophore is a description of the substructure of a ligand (a small molecule) which is responsible for medicinal activity. This medicinal activity is produced by interaction between the ligand and a binding site on a target protein. ILP was chosen by the domain expert (first author) at Pfizer since active molecules are most naturally described, in relational terms, as requiring a substructure (pharmacophore) with various 3-D relations which hold among the atoms involved. The results described in this paper build on previous investigations into prediction of mutagenicity using ILP with a 2-D (bond connectivity only) representation of molecules. The case study supports general lessons for knowledge discovery, as well as more specific lessons for pharmacophore discovery, the use of ILP for 3-D problems, and for the particular medicinal activity of ACE inhibition, a treatment for hypertension.
Building text classifiers using positive and unlabeled examples
- In: Intl. Conf. on Data Mining
, 2003
Cited by 46 (8 self)
This paper studies the problem of building text classifiers using positive and unlabeled examples. The key feature of this problem is that there is no negative example for learning. Recently, a few techniques for solving this problem were proposed in the literature. These techniques are based on the same idea, which builds a classifier in two steps. Each existing technique uses a different method for each step. In this paper, we first introduce some new methods for the two steps, and perform a comprehensive evaluation of all possible combinations of methods of the two steps. We then propose a more principled approach to solving the problem based on a biased formulation of SVM, and show experimentally that it is more accurate than the existing techniques.
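The abstract does not spell out the biased SVM formulation. As a hedged illustration only, one common way to write a soft-margin SVM that treats the positive set $P$ and the unlabeled set $U$ asymmetrically is shown below. The symbols $C_+$, $C_-$, $\xi_i$, $w$ and $b$ are assumptions introduced here, not notation taken from the listing.

```latex
\begin{aligned}
\min_{w,\,b,\,\xi}\quad & \tfrac{1}{2}\, w^{\top} w
    \;+\; C_{+} \sum_{i \in P} \xi_i
    \;+\; C_{-} \sum_{i \in U} \xi_i \\
\text{subject to}\quad & y_i \,\bigl(w^{\top} x_i + b\bigr) \;\ge\; 1 - \xi_i,
    \qquad \xi_i \ge 0, \\
& y_i = +1 \ \text{for } i \in P, \qquad y_i = -1 \ \text{for } i \in U .
\end{aligned}
```

Under this reading, every unlabeled document is provisionally labeled negative, and choosing $C_+ > C_-$ penalizes errors on the known positives more heavily than errors on the unlabeled documents, some of which are hidden positives.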
Learning to classify text using positive and unlabeled data
- In: Proceedings of the 19th international joint conference on artificial intelligence
, 2003
Cited by 42 (9 self)
In traditional text classification, a classifier is built using labeled training documents of every class. This paper studies a different problem. Given a set P of documents of a particular class (called positive class) and a set U of unlabeled documents that contains documents from class P and also other types of documents (called negative class documents), we want to build a classifier to classify the documents in U into documents from P and documents not from P. The key feature of this problem is that there is no labeled negative document, which makes traditional text classification techniques inapplicable. In this paper, we propose an effective technique to solve the problem. It combines the Rocchio method and the SVM technique for classifier building. Experimental results show that the new method outperforms existing methods significantly.
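The abstract names the Rocchio method but does not describe how it is used. The following is a minimal sketch of one plausible first step, assuming that U is provisionally treated as the negative class and that documents in U closer to the negative prototype are kept as "reliable negatives" for a later SVM stage (not shown). The function names, the whitespace tokenizer, and the weights `alpha=16`, `beta=4` are illustrative assumptions, not details from the paper.

```python
import math
from collections import Counter


def tf_vector(text):
    """Unit-length term-frequency vector for a whitespace-tokenized document."""
    counts = Counter(text.lower().split())
    norm = math.sqrt(sum(v * v for v in counts.values()))
    if norm == 0:
        return {}
    return {term: v / norm for term, v in counts.items()}


def centroid(vectors):
    """Component-wise mean of a list of sparse vectors."""
    total = Counter()
    for vec in vectors:
        for term, v in vec.items():
            total[term] += v
    return {term: v / len(vectors) for term, v in total.items()}


def rocchio_prototype(own, other, alpha=16.0, beta=4.0):
    """Rocchio prototype: alpha * centroid(own) - beta * centroid(other).

    alpha and beta are assumed values, chosen only for illustration.
    """
    c_own, c_other = centroid(own), centroid(other)
    terms = set(c_own) | set(c_other)
    return {t: alpha * c_own.get(t, 0.0) - beta * c_other.get(t, 0.0) for t in terms}


def cosine(a, b):
    """Cosine similarity between two sparse vectors."""
    dot = sum(v * b.get(t, 0.0) for t, v in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def reliable_negatives(positive_docs, unlabeled_docs):
    """Return unlabeled documents that sit closer to the negative prototype."""
    pos = [tf_vector(d) for d in positive_docs]
    unl = [tf_vector(d) for d in unlabeled_docs]
    proto_pos = rocchio_prototype(pos, unl)  # P against U
    proto_neg = rocchio_prototype(unl, pos)  # U (treated as negative) against P
    return [
        doc
        for doc, vec in zip(unlabeled_docs, unl)
        if cosine(vec, proto_neg) > cosine(vec, proto_pos)
    ]


positives = ["cheap flights hotel", "hotel booking flights"]
unlabeled = ["flights hotel deals", "football match score", "football league goal"]
negatives = reliable_negatives(positives, unlabeled)
```

In this toy run the travel-related unlabeled document stays out of the reliable-negative set, while the two football documents are selected. A second-stage classifier could then be trained on the positives and these reliable negatives.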
Loglinear Models for First-Order Probabilistic Reasoning
- In Proceedings of the Fifteenth Conference on Uncertainty in Artificial Intelligence
, 1999
Cited by 30 (3 self)
Recent work on loglinear models in probabilistic constraint logic programming is applied to first-order probabilistic reasoning. Probabilities are defined directly on the proofs of atomic formulae, and by marginalisation on the atomic formulae themselves. We use Stochastic Logic Programs (SLPs) composed of labelled and unlabelled definite clauses to define the proof probabilities. We have a conservative extension of first-order reasoning, so that, for example, there is a one-one mapping between logical and random variables. We show how, in this framework, Inductive Logic Programming (ILP) can be used to induce the features of a loglinear model from data. We also compare the presented framework with other approaches to first-order probabilistic reasoning.
Keywords: loglinear models, constraint logic programming, inductive logic programming
1 Introduction. A framework which merges first-order logical and probabilistic inference in a theoretically sound and applicable manner promises ma...
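As a hedged sketch of what "probabilities defined on proofs, and by marginalisation on atomic formulae" can look like, the standard loglinear form is given below. The notation ($x$ for a proof, $f_i$ for features, $\lambda_i$ for weights, $Z_\lambda$ for the normalising constant, $\mathrm{proofs}(a)$ for the set of proofs of an atom $a$) is assumed here for illustration and is not taken from the listing.

```latex
\begin{aligned}
p_{\lambda}(x) &= \frac{1}{Z_{\lambda}} \exp\!\Bigl(\sum_{i} \lambda_i\, f_i(x)\Bigr),
\qquad
Z_{\lambda} = \sum_{x'} \exp\!\Bigl(\sum_{i} \lambda_i\, f_i(x')\Bigr), \\[4pt]
P_{\lambda}(a) &= \sum_{x \,\in\, \mathrm{proofs}(a)} p_{\lambda}(x).
\end{aligned}
```

One natural reading for an SLP is that $f_i(x)$ counts how often labelled clause $i$ is used in proof $x$ and $\lambda_i$ is the logarithm of that clause's label, so the unnormalised weight of a proof is the product of the labels of the clauses it uses.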
A study of two sampling methods for analysing large datasets with ILP
, 1999
Cited by 23 (5 self)
This paper is concerned with problems that arise when submitting large quantities of data to analysis by an Inductive Logic Programming (ILP) system. Complexity arguments usually make it prohibitive to analyse such datasets in their entirety. We examine two schemes that allow an ILP system to construct theories by sampling from this large pool of data. The first, "subsampling", is a single-sample design in which the utility of a potential rule is evaluated on a randomly selected sub-sample of the data. The second, "logical windowing", is a multiple-sample design that tests and sequentially includes errors made by a partially correct theory. Both schemes are derived from techniques developed to enable propositional learning methods (like decision trees) to cope with large datasets. The ILP system CProgol, equipped with each of these methods, is used to construct theories for two datasets -- one artificial (a chess endgame) and the other naturally occurring (a language tagging problem). I...
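The two sampling schemes described above can be sketched in a propositional toy setting. This is an illustration under stated assumptions, not the CProgol implementation: the threshold learner, the function names, and the noise-free data are all hypothetical stand-ins for an ILP system and its examples.

```python
import random


def accuracy(rule, examples):
    """Fraction of (x, label) examples on which the rule agrees with the label."""
    return sum(rule(x) == y for x, y in examples) / len(examples)


def subsample_utility(rule, examples, sample_size, rng):
    """Subsampling: score a candidate rule on one random sub-sample of the data."""
    sample = rng.sample(examples, min(sample_size, len(examples)))
    return accuracy(rule, sample)


def learn_threshold(window):
    """Toy learner (stand-in for an ILP system): best rule of the form x >= t."""
    candidates = sorted({x for x, _ in window}) + [float("inf")]
    best = max(candidates, key=lambda t: accuracy(lambda x: x >= t, window))
    return lambda x, t=best: x >= t


def windowing(examples, learn, initial_size, rng, max_rounds=20):
    """Windowing: learn on a small window, then add the errors and relearn."""
    window = rng.sample(examples, min(initial_size, len(examples)))
    rule = learn(window)
    for _ in range(max_rounds):
        # Examples the current (partially correct) theory gets wrong.
        errors = [ex for ex in examples if rule(ex[0]) != ex[1] and ex not in window]
        if not errors:
            break
        window.extend(errors)
        rule = learn(window)
    return rule, window


rng = random.Random(0)
data = [(x, x >= 50) for x in range(100)]  # noise-free toy concept
rule, window = windowing(data, learn_threshold, initial_size=5, rng=rng)
full_accuracy = accuracy(rule, data)
sampled_accuracy = subsample_utility(rule, data, sample_size=20, rng=rng)
```

The design difference is visible in the code: subsampling draws a single sample and never revisits the rest of the data, while windowing repeatedly tests the current theory on the full pool and grows the window with its mistakes.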
Theory-based causal induction
- In
, 2003
Cited by 23 (13 self)
Inducing causal relationships from observations is a classic problem in scientific inference, statistics, and machine learning. It is also a central part of human learning, and a task that people perform remarkably well given its notorious difficulties. People can learn causal structure in various settings, from diverse forms of data: observations of the co-occurrence frequencies between causes and effects, interactions between physical objects, or patterns of spatial or temporal coincidence. These different modes of learning are typically thought of as distinct psychological processes and are rarely studied together, but at heart they present the same inductive challenge—identifying the unobservable mechanisms that generate observable relations between variables, objects, or events, given only sparse and limited data. We present a computational-level analysis of this inductive problem and a framework for its solution, which allows us to model all these forms of causal learning in a common language. In this framework, causal induction is the product of domain-general statistical inference guided by domain-specific prior knowledge, in the form of an abstract causal theory. We identify 3 key aspects of abstract prior knowledge—the ontology of entities, properties, and relations that organizes a domain; the plausibility of specific causal relationships; and the functional form of those relationships—and show how they provide the constraints that people need to induce useful causal models from sparse data.
CProgol4.4: a tutorial introduction
- In Inductive Logic Programming and Knowledge Discovery in Databases
, 2001
Cited by 21 (2 self)
This chapter describes the theory and use of CProgol4.4, a state-of-the-art Inductive Logic Programming (ILP) system. After explaining how to download the source code, the reader is guided through the development of Progol input files containing type definitions, mode declarations, background knowledge, examples and integrity constraints. The theory behind the system is then described using a simple example as illustration. The main algorithms in Progol are given and methods of pruning the search space of possible hypotheses are discussed. Next the application of built-in procedures for estimating predictive accuracy and statistical significance of Progol hypotheses is demonstrated. Lastly, the reader is shown how to use the more advanced features of CProgol4.4, including positive-only learning and the use of metalogical predicates for pruning the search space.
1 Introduction. The theory and implementation of the Inductive Logic Programming (ILP) system CProgol4.1 was first ...
Learning with Positive and Unlabeled Examples Using Weighted Logistic Regression
- Proceedings of the Twentieth International Conference on Machine Learning (ICML
, 2003
Cited by 20 (6 self)
The problem of learning with positive and unlabeled examples arises frequently in retrieval applications.
How Do We Evaluate Artificial Immune Systems
- Evolutionary Computation
, 2005
Cited by 16 (0 self)
The field of Artificial Immune Systems (AIS) concerns the study and development of computationally interesting abstractions of the immune system. This survey tracks the development of AIS since its inception, and then attempts to make an assessment of its usefulness, defined in terms of ‘distinctiveness’ and ‘effectiveness.’ In this paper, the standard types of AIS are examined (Negative Selection, Clonal Selection and Immune Networks), as well as a new breed of AIS, based on the immunological ‘danger theory.’ The paper concludes that all types of AIS largely satisfy the criteria outlined for being useful, but only two types of AIS satisfy both criteria with any certainty.
The Acquisition of Grammar in an Evolving Population of Language Agents
- of Art. Intelligence (Special Issue: Machine Intelligence
, 1999
Cited by 15 (1 self)
Human language acquisition, and in particular the acquisition of grammar, is a partially-canalized, strongly-biased but robust and efficient procedure. For example, children prefer to induce lexically compositional rules (e.g. Wanner and Gleitman, 1982) despite the use, in every attested human language, of constructions, such as morphological negation or non-compositional idioms. And, most parameters of grammatical variation set during language acquisition appear to have default or so-called unmarked values retained in the absence of robust counter-evidence (e.g. Bickerton, 1984; Hyams, 1986; Lightfoot, 1992). A variety of explanations have been offered for the emergence of a partially-innate language acquisition device (LAD) with such properties based on saltation (Berwick, 1998; Bickerton, 1990, 1998) or genetic assimilation (Pinker and Bloom, 1990). But none provide a coherent detailed account of both the emergence and maintenance of a LAD in an evolving population. The account proposed here is that a minimal LAD emerged via recruitment of general-purpose (Bayesian) learning mechanisms (e.g. Staddon, 1988; Cosmides and Tooby, 1996) to a specifically-linguistic mental representation capable of expressing mappings from the `language of thought' to realizable, essentially linearized, encodings of propositions of the language of thought. However, the selective pressure favouring such a development, and its subsequent maintenance and refinement, is only coherent given a coevolutionary scenario in which a (proto)language supporting successful communication within a population had already itself evolved on a historical timescale (e.g. Hurford, 1987; Kirby, 1998; Steels, 1998) and continued to coevolve with the LAD (e.g. Briscoe, 1997, 1998a,b). The model of the LAD presented here builds on and extends previous work in the parameter setting

