Results 1 – 10 of 10
The subsumption lattice and query learning
Proceedings of the International Conference on Algorithmic Learning Theory, 2004
Abstract
Cited by 5 (3 self)
The paper identifies several new properties of the lattice induced by the subsumption relation over first-order clauses and derives implications of these for learnability. In particular, it is shown that the length of subsumption chains of function-free clauses with bounded size can be exponential in the size. This suggests that simple algorithmic approaches that rely on repeating minimal subsumption-based refinements may require a long time to converge. It is also shown that with bounded size clauses the subsumption lattice has a large branching factor. This is used to show that the class of first-order length-bounded monotone clauses is not properly learnable from membership queries alone. Finally, the paper studies pairing, a generalization operation that takes two clauses and returns a number of possible generalizations. It is shown that there are clauses with an exponential number of pairing results which are not related to each other by subsumption. This is used to show that recent pairing-based algorithms can make exponentially many queries on some learning problems.
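The subsumption relation at the heart of this abstract can be made concrete with a small brute-force checker (an illustrative sketch, not the paper's algorithm): clause C θ-subsumes clause D if some substitution θ of C's variables makes every literal of Cθ appear in D. For function-free clauses it suffices to try all mappings of C's variables to the terms occurring in D.

```python
from itertools import product

def subsumes(c, d):
    """Brute-force theta-subsumption check for function-free clauses.

    Literals are (predicate, args) tuples; argument strings starting
    with an uppercase letter are variables. C subsumes D iff some
    substitution theta maps C's variables to terms of D such that
    every literal of C*theta occurs in D.
    """
    is_var = lambda t: t[0].isupper()
    c_vars = sorted({t for _, args in c for t in args if is_var(t)})
    d_terms = sorted({t for _, args in d for t in args})
    for image in product(d_terms, repeat=len(c_vars)):
        theta = dict(zip(c_vars, image))
        if all((p, tuple(theta.get(t, t) for t in args)) in d
               for p, args in c):
            return True
    return False
```

The search is exponential in the number of variables of C, which is consistent with the abstract's point that naive subsumption-based procedures can be expensive.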
Construction and learnability of canonical Horn formulas
Abstract
Cited by 3 (0 self)
We describe an alternative construction of an existing canonical representation for definite Horn theories, the Guigues-Duquenne basis (or GD basis), which minimizes a natural notion of implicational size. We extend the canonical representation to general Horn by providing a reduction from definite to general Horn CNF. Using these tools, we provide a new, simpler validation of the classic Horn query learning algorithm of Angluin, Frazier, and Pitt, and we prove the surprising result that this algorithm always outputs the GD basis regardless of the counterexamples it receives.
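The definite Horn setting above rests on a simple closure operator: the set of atoms derivable from a starting set by forward chaining over the implications. A minimal sketch (generic forward chaining, not the paper's GD-basis construction):

```python
def closure(atoms, implications):
    """Forward-chain a set of atoms under definite Horn implications.

    implications: iterable of (body, head) pairs, where body is a set
    of atoms and head is a single atom. Returns the least superset of
    `atoms` closed under all implications.
    """
    closed = set(atoms)
    changed = True
    while changed:
        changed = False
        for body, head in implications:
            if body <= closed and head not in closed:
                closed.add(head)
                changed = True
    return closed
```

For instance, under a -> b and b -> c, the closure of {a} is {a, b, c}; an implication whose body is never satisfied contributes nothing.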
Query Learning and Certificates in Lattices
Abstract
Cited by 2 (2 self)
We provide an abstract version, in terms of lattices, of the Horn query learning algorithm of Angluin, Frazier, and Pitt. To validate it, we develop a proof that is independent of the propositional Horn logic structure. We also construct a certificate set for the class of lattices that generalizes and improves an earlier certificate construction and that relates very clearly with the new proof.
Distribution-free Bounds for Relational Classification
Abstract
Cited by 1 (1 self)
Statistical Relational Learning (SRL) is a subarea of Machine Learning which addresses the problem of performing statistical inference on data that is correlated and not independently and identically distributed (i.i.d.), as is generally assumed. For the traditional i.i.d. setting, distribution-free bounds exist, such as the Hoeffding bound, which are used to provide confidence bounds on the generalization error of a classification algorithm given its holdout error on a sample of size N. Bounds of this form are currently not present for the type of interactions that are considered in the data by relational classification algorithms. In this paper we extend the Hoeffding bounds to the relational setting. In particular, we derive distribution-free bounds for certain classes of data generation models that do not produce i.i.d. data and are based on the type of interactions that are considered by relational classification algorithms that have been developed in SRL. We conduct empirical studies on synthetic and real data which show that these data generation models are indeed realistic and the derived bounds are tight enough for practical use.
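The classical two-sided Hoeffding bound referred to above says that, for N i.i.d. samples, the true error deviates from the holdout error by at most ε = sqrt(ln(2/δ) / (2N)) with probability at least 1 − δ. A minimal computation of this textbook quantity (the i.i.d. baseline, not the paper's relational extension):

```python
import math

def hoeffding_epsilon(n, delta):
    """Two-sided Hoeffding deviation bound for [0, 1]-valued losses:
    with probability >= 1 - delta, the true error lies within epsilon
    of the empirical (holdout) error computed on n i.i.d. samples."""
    return math.sqrt(math.log(2.0 / delta) / (2.0 * n))
```

For example, with N = 1000 and δ = 0.05 this gives ε ≈ 0.043; quadrupling the sample size halves the bound, since ε shrinks as 1/sqrt(N).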
Complexity parameters of first order classes
In Proceedings of the 13th International Conference on Inductive Logic Programming, 2003
Abstract
Cited by 1 (0 self)
We study several complexity parameters for first order formulas and their suitability for first order learning models. We show that the standard notion of size is not captured by sets of parameters that are used in the literature and thus they cannot give a complete characterization in terms of learnability with polynomial resources. We then identify an alternative notion of size and a simple set of parameters that are useful in this sense. Matching lower bounds derived using the Vapnik-Chervonenkis dimension complete the picture showing that these parameters are indeed crucial.
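The Vapnik-Chervonenkis dimension used for the lower bounds above can be computed by brute force for tiny concept classes: it is the size of the largest point set on which the class realizes every possible labeling. A small illustrative checker (a sketch over finite domains, unrelated to the paper's specific classes):

```python
from itertools import combinations

def shatters(concepts, points):
    """True if the concept class (a list of sets over a finite domain)
    realizes every one of the 2^|points| labelings of `points`."""
    labelings = {frozenset(c & set(points)) for c in concepts}
    return len(labelings) == 2 ** len(points)

def vc_dimension(concepts, domain):
    """Largest d such that some d-element subset of the domain is
    shattered by the concept class; 0 if no single point is."""
    d = 0
    for k in range(1, len(domain) + 1):
        if any(shatters(concepts, s) for s in combinations(domain, k)):
            d = k
    return d
```

For example, the class of upward-closed thresholds {x : x >= t} on the domain {0, 1, 2, 3} shatters any single point but no pair (including the smaller point forces including the larger), so its VC dimension is 1.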
Canonical Horn Representations and Query Learning
Abstract
Cited by 1 (1 self)
We describe an alternative construction of an existing canonical representation for definite Horn theories, the Guigues-Duquenne basis (or GD basis), which minimizes a natural notion of implicational size. We extend the canonical representation to general Horn by providing a reduction from definite to general Horn CNF. We show how this representation relates to two topics in query learning theory: first, we show that a well-known algorithm by Angluin, Frazier and Pitt that learns Horn CNF always outputs the GD basis independently of the counterexamples it receives; second, we build strong polynomial certificates for Horn CNF directly from the GD basis.
Polynomial Certificates for Propositional Classes
Abstract
This paper studies the complexity of learning classes of expressions in propositional logic from equivalence queries and membership queries. In particular, we focus on bounding the number of queries that are required to learn the class, ignoring computational complexity. This quantity is known to be captured by a combinatorial measure of concept classes known as the certificate complexity. The paper gives new constructions of polynomial-size certificates for monotone expressions in conjunctive normal form (CNF), for unate CNF functions, where each variable affects the function either positively or negatively but not both ways, and for Horn CNF functions. Lower bounds on certificate size for these classes are derived, showing that for some parameter settings the new certificate constructions are optimal. Finally, the paper gives an exponential lower bound on the certificate size for a natural generalization of these classes known as renamable Horn CNF functions, thus implying that the class is not learnable from a polynomial number of queries.
Preprint submitted to Information and Computation, 2 March 2006.
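For the monotone case mentioned above, a certificate of non-membership has a particularly simple shape: a pair of inputs x <= y (bitwise) with f(x) = 1 and f(y) = 0 is inconsistent with every monotone function. A brute-force search for such a witness pair (an illustrative sketch, not the paper's construction):

```python
from itertools import product

def monotonicity_certificate(f, n):
    """Return a pair (x, y) of n-bit inputs with x <= y bitwise but
    f(x) > f(y), witnessing that f is not monotone; None if f is
    monotone. f maps a 0/1 tuple to 0 or 1."""
    cube = list(product((0, 1), repeat=n))
    for x in cube:
        for y in cube:
            if all(a <= b for a, b in zip(x, y)) and f(x) > f(y):
                return (x, y)
    return None
```

Parity on two bits yields the witness ((0, 1), (1, 1)), while a monotone function such as OR yields none. The exponential sweep over the cube is only for illustration; the point of certificates is that a small such set S already rules out the whole class.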
Test Set Bounds for Relational Data that Vary with Strength of Dependence
Abstract
A large portion of the data that is collected in various application domains, such as online social networking, finance, and biomedicine, is relational in nature. A subfield of Machine Learning, namely Statistical Relational Learning (SRL), is concerned with performing statistical inference on relational data. A defining property of relational data that separates it from independently and identically distributed (i.i.d.) data is the existence of correlations between individual data points. A major portion of the theory developed in machine learning assumes the data is i.i.d. In this paper we develop theory for the relational setting. In particular, we derive distribution-free bounds on the generalization error of a classifier for the relational setting, where the class of data generation models we consider is inspired by the type of joint distributions that are represented by relational classification models developed by the SRL community. A key aspect of the bound we derive is that its tightness is a function of the strength of dependence between related data points, with the bound reducing to the standard Hoeffding or McDiarmid inequality when there is no dependence. To the best of our knowledge this is the first bound for relational data whose tightness varies with the strength of dependence. Moreover, the bound provides insight into the computation of the effective sample size, an important notion introduced by Jensen and Neville (2002).
Certificates of Non-Membership for Classes of Read-Once Functions
Abstract
A certificate of non-membership for a Boolean function f with respect to a class C, f ∉ C, is a set S of input strings such that the values of f on strings from S are inconsistent with any function h ∈ C. We study certificates of non-membership with respect to several classes of read-once functions, generated by their bases. For the basis {&, ∨, ¬}, we determine the optimal certificate size for every function outside the class and deduce that 6 strings always suffice. For the same basis augmented with a function x1 · · · xs ∨ x̄1 · · · x̄s, we show that there exist n-variable functions requiring Ω(n^(s−1)) strings in a certificate as n → ∞. For s = 2, we show that this bound is tight by constructing certificates of size O(n) for all functions outside the class.
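As a toy illustration of the idea (not the paper's construction): over the basis {&, ∨, ¬}, a read-once formula uses each variable at most once, so on two variables the possible formulas are constants, single literals, and a (possibly negated) AND or OR of one literal per variable. XOR realizes a truth table none of these produce, so its four input strings form a certificate of non-membership. The enumeration below checks this exhaustively.

```python
from itertools import product

def read_once_tables_2vars():
    """Truth tables (4-tuples over inputs 00, 01, 10, 11) of all
    two-variable read-once formulas over the basis {&, |, ~}:
    (possibly negated) AND/OR of one literal per variable, plus the
    degenerate single-literal and constant formulas."""
    inputs = list(product((0, 1), repeat=2))
    # literals: x1, ~x1, x2, ~x2 (s toggles negation via XOR)
    lits = [lambda v, i=i, s=s: v[i] ^ s for i in (0, 1) for s in (0, 1)]
    tables = set()
    for l1 in lits[:2]:                      # a literal of x1
        for l2 in lits[2:]:                  # a literal of x2
            for g in (lambda a, b: a & b, lambda a, b: a | b):
                t = tuple(g(l1(v), l2(v)) for v in inputs)
                tables.add(t)
                tables.add(tuple(1 - b for b in t))   # outer negation
    for l in lits:                           # single-literal formulas
        tables.add(tuple(l(v) for v in inputs))
    tables.add((0, 0, 0, 0))                 # constant formulas
    tables.add((1, 1, 1, 1))
    return tables
```

Every AND-type table here has exactly one 1 (and its negation exactly one 0), every OR-type table exactly one 0, and literals split the cube in half by one coordinate; the XOR table (0, 1, 1, 0) matches none of them.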