D. Helmbold, R. Sloan and M. K. Warmuth. Learning Nested Differences of Intersection-Closed Concept Classes. Machine Learning, 5(2):165--196, 1990.
....algorithm for DNF. We show that any circuit that has a subexponential DNF or CNF size also has a subexponential learning algorithm from equivalence queries. For polynomial size DNF our algorithm runs in time 2 . Our algorithm is based on learning k decision list from equivalence queries [HSW90]. We show that every n decision list of length s is an O( n log n log s) decision list. Therefore running the O( n log n) decision list learning algorithm will learn any polynomial size DNF. In Section 2 we give the learning model and some definitions. In Section 3 we show that any ....
....start from the root and do the following at each node. If the node is a leaf, we compute the decision list in that leaf; if it is an internal node v, we proceed to the left subtree if the value of the term in v is 1 and to the right subtree if it is 0. 2.3 Previous Results. Helmbold et al. [HSW90] showed that k-decision lists are learnable from equivalence queries in time $n^{O(k)}$. In [Bl92], Blum proved that any decision tree of size s is a (log s)-decision list. Therefore, there is a learning algorithm for decision trees of size s that runs in time $n^{O(\log s)}$. 3 Decision List for DNF. In ....
D. Helmbold, R. Sloan and M. K. Warmuth. Learning Nested Differences of Intersection-Closed Concept Classes. Machine Learning, 165--196, 5, 2, 1990.
....we show in the following lemma that Phi k DL, and hence k sink width two branching programs, are exactly learnable from equivalence queries. The idea is to use the closure algorithm for learning nested differences of intersection closed concept classes due to Helmbold, Sloan, and Warmuth [HSW90]. Lemma 4.3 The class Phi k DL of decision lists whose nodes are parity of monomials of size at most k is exactly learnable using equivalence queries. Proof By the transformation technique of Littlestone [L88] it suffices to prove the claim for the concept class of decision list with parity ....
....variables for each k subset of the variables and learn the target concept as a new function over at most n n variables. To exactly learn Phi n DL we will appeal to the exact learning algorithm for nested differences of intersection-closed concept classes due to Helmbold, Sloan, and Warmuth [HSW90]. For this we will argue that we can express any element of Phi n DL as a nested difference of vector spaces over GF(2). Assume that the target concept f ∈ Phi n DL is given by f = ((a_1, b_1), (a_2, b_2), ..., (a_k, b_k)), where a_1, a_2, ..., a_k ∈ {0,1} and b_1, b ....
David Helmbold, Robert Sloan, and Manfred Warmuth. Learning Nested Differences of Intersection-Closed Concept Classes. In Machine Learning, 5:165--196, 1990.
....functions. We find the following characterizations of C 1 DL. It coincides with: C R DH, the renaming closure of the class of functions f such that both f and its complement are Horn [12] (also called disguised double Horn functions); CND, the class of nested differences of concepts [21], where each concept is described by a single term; C 2M CR 1, the intersection of the classes of 2-monotonic functions [32] and read-once functions, i.e., functions definable by a formula in which each variable occurs at most once [18, 25, 39, 37]; C TH CR 1, the intersection of ....
....has a unique prime DNF. Conversely, in (3.2) can be rewritten by factorization to a linear read once formula t 1 (x 1 t 2 (x 2 t 3 (x 3 t n x n ) where empty terms t i are simply omitted. The result thus follows from Theorem 3.4. Nested differences of concepts. In [21], learning issues for concept classes have been studied which satisfy certain properties. In particular, learning of concepts expressed as the nested difference c_1 \ (c_2 \ (... (c_{k-1} \ c_k)...)) of concepts c_1, ..., c_k has been considered, where the c_i are from a concept class which is ....
D. Helmbold, R. Sloan, and M. Warmuth. Learning Nested Differences of Intersection-Closed Concept Classes. Machine Learning, 5:165--196, 1990.
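The nested difference c_1 \ (c_2 \ (... \ c_k)) quoted in the context above can be evaluated from the innermost concept outward. The following is only a minimal sketch: representing each concept as a membership predicate, the name `in_nested_difference`, and the interval concepts in the usage note are all illustrative assumptions, not taken from the cited paper.

```python
from typing import Callable, Sequence

# Assumption: a concept is represented by its membership predicate.
Concept = Callable[[object], bool]


def in_nested_difference(x, concepts: Sequence[Concept]) -> bool:
    """Return True iff x lies in c_1 \\ (c_2 \\ (... \\ c_k))."""
    inside = False  # membership in the empty tail of the nesting
    # Work from c_k back to c_1: x is in c_i \ rest iff x is in c_i
    # and x is not in rest.
    for c in reversed(concepts):
        inside = c(x) and not inside
    return inside


def interval(lo, hi) -> Concept:
    """Hypothetical helper: the closed interval [lo, hi] as a concept."""
    return lambda x: lo <= x <= hi
```

For example, with the hypothetical concepts `[interval(0, 10), interval(3, 7), interval(4, 5)]`, the point 1 is accepted, 3 is rejected (it falls in the first exception), and 4 is accepted again (it falls in the exception to the exception).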
....GS: H := ∅; while EQ(⟨H⟩_C) = no do: let c be the counterexample; H := H ∪ {c}; endwhile; return ⟨H⟩_C. Figure 1: Generating Set algorithm (GS). Consequently, it is possible to extend the results in this paper to nested differences of intersection-closed classes using an approach similar to [11]. It is possible to convert algorithm GS to a proper algorithm finding, for every set of generators H, a concept c equivalent to ⟨H⟩_C, but this step can be computationally expensive. Clearly, algorithm GS always finds the target concept. The only drawback is its time complexity, since it can ....
D. Helmbold, R. Sloan, and M. Warmuth. Learning Nested Differences of Intersection-Closed Concept Classes. Machine Learning, 5(2):165--196, 1990.
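The GS loop in the context above (query the closure of the counterexamples seen so far, add each new counterexample, stop when the equivalence query succeeds) can be sketched in a few lines. This sketch makes several assumptions that are not in the source: the concept class is axis-parallel boxes over a small integer grid, the closure is the bounding box, and the equivalence oracle is simulated by brute force over a finite domain. All function names are hypothetical.

```python
from itertools import product


def closure(points):
    """Smallest axis-parallel box containing `points` (None for no points)."""
    if not points:
        return None
    dims = range(len(points[0]))
    lo = tuple(min(p[i] for p in points) for i in dims)
    hi = tuple(max(p[i] for p in points) for i in dims)
    return lo, hi


def member(box, x):
    """Membership test; the empty hypothesis (None) contains nothing."""
    if box is None:
        return False
    lo, hi = box
    return all(l <= v <= h for l, v, h in zip(lo, x, hi))


def make_eq_oracle(target, domain):
    """Simulated equivalence oracle: returns a counterexample or None."""
    def eq(hypothesis):
        for x in domain:
            if member(hypothesis, x) != member(target, x):
                return x
        return None
    return eq


def generating_set(eq):
    """GS loop: grow H with counterexamples until the closure is accepted."""
    H = []
    while (c := eq(closure(H))) is not None:
        H.append(c)
    return closure(H), H
```

Because the hypothesis is always the closure of positive points, it never leaves the target, so every counterexample in this sketch is a positive example. With `domain = list(product(range(6), repeat=2))` and the target box `((1, 2), (4, 3))`, the loop terminates with exactly that box.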
....with L followed by item (l; fi) or (1; fi) respectively) is still consistent with f . For the time being, let us assume that there is a procedure P which can efficiently compute all legal extensions of a given list L. It is well known that 1 DL n;q is learnable using q calls of P . See [13] or [8], for instance) The proof is now completed as follows. Observe that (l; fi) is a legal extension of L iff f j fi on the subcube given by l 1 l 2 : l i l, which can be tested by 1 UAV MQ. Thus, 2(n Gamma i) UAV MQs are enough to exhaustively test all candidate literals l. Note that (1; fi) ....
David Helmbold, Robert Sloan, and Manfred K. Warmuth. Learning nested differences of intersection-closed concept classes. Machine Learning, 5:165--196, 1990.
....the sample, is efficient. We only require the existence of such a sequence. We shall refer to these two models as the ordered and order-independent models. 1.2 Known results about space-bounded learning and the compression parameter. Previous work on learning with space restrictions includes [4], [6], and [3]. Haussler [4] was the first to deal with space-bounded learning. Haussler presents space-efficient learning algorithms for a restricted class of decision trees and for certain finite-valued functions on the real line. The space used by these two algorithms is polynomially bounded in ....
....trees and for certain finite valued functions on the real line. The space used by these two algorithms is polynomially bounded in the VC dimension of the concept class. The class of nested differences of intersection closed concept classes has been investigated by Helmbold, Sloan and Warmuth in [6]. They show that for special classes with this property space efficient learning is possible. Littlestone and Warmuth in [8] introduce the notion of a kernel size, which is very similar to what we call order independent compression parameter of a concept class. They show a bound on the sample ....
David Helmbold, Robert Sloan, and Manfred K. Warmuth. Learning nested differences of intersection-closed concept classes. In Proceedings of the 2nd Annual Workshop on Computational Learning Theory, pages 41--56, 1989.
....learns any s-term DNF over n variables in time $2^{O(\sqrt{n \log s}\,\log^{3/2} n)}$. At the heart of Bshouty's algorithm is a structural result which shows that any s-term DNF can be expressed as an $O(\sqrt{n \log n \log s})$-decision list; armed with this result, Bshouty uses a standard algorithm [22] for learning decision lists to obtain his DNF learning result. Tarui and Tsukiji [35] gave a completely different proof of a similar time bound for learning DNF. They adapted the machinery of approximate inclusion-exclusion developed by Linial and Nisan [28] to show that for any s-term DNF f ....
D. Helmbold, R. Sloan and M. Warmuth. Learning nested differences of intersection-closed concept classes. Machine Learning 5 (1990), 165-196.
....possible decision tree. Boucheron and Sallantin [10] show that some classes of boolean functions can be learned time efficiently using only logarithmic (in the number of variables) space. PAC learning while remembering only a fixed number of examples, each of a bounded size is considered in [3, 18, 32]. The most general investigation on this line was the observation in [51] that the boosting algorithm can be made reasonably space efficient as well. Sample complexity gives only a very crude accounting of space utilization. Learning procedures may want to remember other information than just ....
D. Helmbold, R. Sloan, and M. Warmuth. Learning nested differences of intersection-closed concept classes. In R. Rivest, D. Haussler, and M. Warmuth, editors, Proceedings of the 1989 Workshop on Computational Learning Theory, pages 41--56. Morgan Kaufmann, 1989.
....of polynomial size. Since these concepts are trivially polynomially learnable, not much attention has been paid to them. In the past, researchers have concentrated on the learnability of concept classes whose sample spaces are, of course (otherwise the problem would be trivial), super-polynomial [31, 3, 25, 12], although efficient sampling was studied, for example, in [11]. On the other hand, our problem has trivial algorithms that need a high polynomial number of examples, but it also has non-trivial algorithms requiring a low polynomial number of examples. It is worth noting that artificial intelligence ....
D. Helmbold, R. Sloan, and M. Warmuth. Learning nested differences of intersection-closed classes. 2nd Workshop on Computational Learning Theory, 41-56, 1989.
....model there is an interesting relationship between teaching and data compression. Applying the results of Floyd [8] we obtain that for any maximum class C there is a teacher-learner pair for which at most vcd(C) examples are presented. Likewise, from the results of Helmbold, Sloan and Warmuth [15], it follows that for any intersection-closed class C, the nested difference of p functions from C can be taught in our model with at most p · vcd(C) examples. It is clear that in this paper we only scratch the surface of such results that follow from previous work of others. In addition to ....
....we get the following corollary. Corollary 10 For any maximum class C, there is a valid T L pair such that the optimal teaching set has length at most the VC dimension of C. We can show the corresponding result for intersection-closed classes by applying results from Helmbold, Sloan and Warmuth [15]. They define a spanning set of a representation c ∈ C with respect to the class C to be a set I ⊆ c having the property that c is the unique, most specific representation consistent with the instances in I, and show that for intersection-closed classes all minimum spanning sets have size at most ....
David Helmbold, Robert Sloan, and Manfred K. Warmuth. Learning nested differences of intersection-closed concept classes. Machine Learning, 5:165--196, 1990. Special issue for COLT 89.
....we show in the following lemma that Phi k DL, and hence k sink width two branching programs, are exactly learnable from equivalence queries. The idea is to use the closure algorithm for learning nested differences of intersection closed concept classes due to Helmbold, Sloan, and Warmuth [HSW90]. Lemma 4.3 The class Phi k DL of decision lists whose nodes are parity of monomials of size at most k is exactly learnable using equivalence queries. Proof By the transformation technique of Littlestone [L88] it suffices to prove the claim for the concept class of decision list with parity ....
....variables for each k subset of the variables and learn the target concept as a new function over at most n n k variables. To exactly learn Phi n DL we will appeal to the exact learning algorithm for nested differences of intersection-closed concept classes due to Helmbold, Sloan, and Warmuth [HSW90]. For this we will argue that we can express any element of Phi n DL as a nested difference of vector spaces over GF(2)^n. Assume that the target concept f ∈ Phi n DL is given by f = ((a_1, b_1), (a_2, b_2), ..., (a_k, b_k)), where a_1, a_2, ..., a_k ∈ {0,1}^n and b_1 ....
David Helmbold, Robert Sloan, and Manfred Warmuth. Learning Nested Differences of Intersection-Closed Concept Classes. In Machine Learning, 5:165--196, 1990.
....of the best component learning algorithm [80] Thus the performance of the master algorithm is almost as good as that of the best component algorithm. This is particularly useful when a good learning algorithm is known but a parameter of the algorithm has to be tuned for the particular application [61]. In this case the weighted majority method is applied to a pool of component algorithms, each of which is a version of the original learning algorithm with a different setting of the parameter. The master algorithm s performance approaches the performance of the component algorithm with the best ....
D. Helmbold, R. Sloan, and M. K. Warmuth. Learning nested differences of intersection closed concept classes. Machine Learning, 5:165--196, 1990.
....sequences. This approach can be compared to the ideas and results contained in [3, 6, 18] where general (nonefficient) conversion strategies to make an on line learning algorithm robust to adversarial noise were proposed. We consider a very general on line strategy known as Closure Algorithm [8, 10, 11, 20] for learning intersection closed concept classes in the noise free model. We extend this strategy to our noisy learning setting and show a worst case mistake bound of m (d 1)K for learning an arbitrary intersection closed concept class C, where K is the number of noisy labels, d is a ....
....problem's parameters. To our knowledge, this is the first example of a quite general and efficient on-line strategy for learning in the presence of noise. We are currently investigating the extension of our results to the noisy learning of nested differences of intersection-closed concept classes (see [10]). An open problem is whether the sample size bounds for converting an on-line algorithm to a malicious PAC learning algorithm can be substantially improved without being dependent on the size of the hypothesis class as in Theorem 20. Acknowledgments. Nicolò Cesa-Bianchi is also with DSI, ....
D.P. Helmbold, R. Sloan, and M.K. Warmuth. Learning nested differences of intersection-closed concept classes. Machine Learning, 5(2):165--196, 1990.
....define a restricted sublanguage which includes not only acyclic concept graphs with a bounded number of vertices, but also all concept descriptions that do not use the SAME AS construct, and several other interesting subsets of CoreClassic. 5.1 An algorithm for learning from positive examples. Helmbold [1990] and others have noted that in every conjunctively closed language L, there is a canonical algorithm for learning from positive examples only: given a set of positive examples x_1, ..., x_m, return the most specific concept in L that includes these concepts. In our learning framework, set ....
, 1990.
....learnable, even in the relatively weak model of polynomial predictability, a natural next step is to consider syntactically restricting the language to promote learnability. In this section we will consider restrictions that make a particular learning method tractable. 5.1 The LCS Algorithm. Helmbold [1990] and others have noted that in every conjunctively closed language L, there is a canonical algorithm for PAC learning from positive examples only: given a set of positive examples x_1, ..., x_m, return the most specific concept in L that includes these concepts. In our learning framework, ....
, 1990.
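The canonical algorithm quoted in the two contexts above (return the most specific concept that includes all positive examples) can be illustrated on a much simpler conjunctively closed language than the one those papers study. The sketch below assumes conjunctions of attribute-value tests over fixed-length tuples; that choice, and the names `most_specific_conjunction` and `satisfies`, are illustrative assumptions only.

```python
def most_specific_conjunction(positives):
    """Return {attribute index: value} for every attribute on which all
    positive examples agree. This is the most specific conjunction of
    attribute-value tests that covers every positive example."""
    if not positives:
        raise ValueError("need at least one positive example")
    first = positives[0]
    return {i: v for i, v in enumerate(first)
            if all(x[i] == v for x in positives)}


def satisfies(conjunction, x):
    """True iff example x passes every test in the conjunction."""
    return all(x[i] == v for i, v in conjunction.items())
```

Given the hypothetical positives `(1, 0, 1, 1)`, `(1, 1, 1, 0)` and `(1, 0, 1, 0)`, the examples agree only on attributes 0 and 2, so the returned conjunction tests exactly those two attributes. Adding further positive examples can only remove tests, which is the usual behaviour of learning from positive examples in a conjunctively closed class.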
....addressed in this paper are representation and modularization of knowledge, and support of this process by machine learning algorithms. To improve the comprehensibility of large knowledge bases, a number of representational schemes were developed that provide modularity by supporting exceptions [Ver80, Riv87, KMU93b, HSW89, DK95], the most general for attribute-value based representations and most strongly structured scheme being ripple down rules [CJ88]. A ripple down rule (RDR) is a list of rules; each rule may be associated to another list of rules: its exceptions. The implied concept of locality (an exception is ....
....RDR learning algorithm is strongly related to the strategy for finding interesting rules. 1. Given a conclusion, find the rule that is least likely to predict that conclusion by chance 2. learn the exceptions recursively (except branch) 3. learn the remaining samples recursively (if not branch) In [HSW89] an algorithm is proposed, that learns nested exceptions by repeatedly covering positive and negative examples, yet the algorithm does not deal with disjunctions of rules in the exception levels; this simplifies the learning problem. In [DK95] an algorithm is presented, that uses a classical rule ....
D. Helmbold, R. Sloan, and M. K. Warmuth. Learning nested differences of intersection-closed concept classes. Proceedings of the Workshop on Computational Learning Theory, pages 41--56, 1989.
....We find the following characterizations of C 1 DL. It coincides with: C R DH, the renaming closure of the class of functions f such that both f and its complement are Horn [12] (also called disguised double Horn functions); CND, the class of nested differences of concepts [20], where each concept is described by a single term; C 2M CR 1, the intersection of the classes of 2-monotonic functions [29] and read-once functions, i.e., functions definable by a formula in which each variable occurs at most once [17, 23, 35, 34]; C TH CR 1, the intersection of ....
...., i = 1, 2, ..., m, are pairwise disjoint positive terms and t i for i = 1, 2, ..., m are possibly empty. In particular, (3.2) implies = if m = 0. Conversely, every such of (3.2) represents an f ∈ C R DH (equivalently, an f ∈ C 1 DL, f ∈ CLR 1). Nested differences of concepts. In [20], learning issues for concept classes have been studied which satisfy certain properties. In particular, learning of concepts expressed as the nested difference c_1 \ (c_2 \ (... (c_{k-1} \ c_k)...)) of concepts c_1, ..., c_k has been considered, where the c_i are from a ....
D. Helmbold, R. Sloan, and M. Warmuth. Learning Nested Differences of Intersection-Closed Concept Classes. Machine Learning, 5:165--196, 1990.
....and the Vapnik-Chervonenkis dimension grows logarithmically with respect to the unary encoding [HSW92]. Helmbold, Sloan and Warmuth show that not only is the class of submodules of Z^d encoded in binary efficiently learnable, but so is the class of nested differences of members of this class [HSW90]. In contrast, we show that the prediction problem for the class of finite unions of modules (which we call semimodules) is also as hard as that for DNF, for fixed dimensions in binary, or variable dimensions in unary. The prediction problem of a representation class is the same as the ....
D. Helmbold, R. Sloan, and M. K. Warmuth. Learning nested differences of intersection closed concept classes. Machine Learning, 5(2), June 1990.
....primarily used to show that certain classes of concepts are not learnable and concerns itself with learnability in the limit. A model related to the PAC model has been proposed to deal more directly with analyzing the worst-case error of learning algorithms (Haussler, Littlestone, Warmuth, 1990; Helmbold, Sloan, Warmuth, 1990). This model is designed to predict the probability of a classification error on the (N+1)st example after being trained on N training examples. 1.2 A framework for predicting the mean error rate. In this paper we present a framework for analyzing learning algorithms that will yield the expectation ....
Helmbold, D., Sloan, R., and Warmuth, M. 1990. Learning nested differences of intersection-closed concept classes, Machine Learning, 5, 165-196.
....m d Delta of them) determine hyperplanes (halfspaces) that yield a useful set of virtual variables. The class of axis-parallel boxes is a simple example of an intersection-closed concept class, and nested differences of concepts from this class are efficiently learnable in the PAC model [HSW90]. A challenging open problem is to find an on-line algorithm for learning nested differences of axis-parallel boxes over the discretized domain {1, ..., n}^d with at most O(pd log n) mistakes (where p is the depth of the target concept) and time polynomial in p, d, and log n. There is a ....
D. P. Helmbold, R. Sloan and Manfred K. Warmuth. Learning Nested Differences of Intersection-Closed Concept Classes. Machine Learning, vol. 5, pp. 165-196, 1990.
....has as a hypothesis the smallest axis parallel rectangle consistent with these points. This hypothesis is guaranteed to be consistent with the original set of examples. This class is of VC dimension four. 2 Rectangles are one example of an intersection closed concept class. The results of [HSW89] lead to compression schemes of size at most d for any intersection closed class of VC dimension d. Example (intervals on the line) One compression function for the class of at most n intervals on the line scans the points from left to right, saving the first positive example, and then the first ....
Helmbold, D., Sloan, R., and Warmuth, M., "Learning Nested Differences of Intersection Closed Concept Classes", Proceedings of the 1989 Workshop on Computational Learning Theory, Morgan Kaufmann, 1989, p.41-56.
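The rectangle example in the context above (the hypothesis is the smallest axis-parallel rectangle consistent with the retained points) suggests a simple compression scheme. The sketch below is an illustration under stated assumptions rather than the construction from the cited papers: the sample is a list of `((x, y), label)` pairs labeled consistently with some axis-parallel rectangle, and `compress`, `reconstruct` and `predict` are hypothetical names.

```python
def compress(sample):
    """Keep at most four positive points: leftmost, rightmost, lowest, highest."""
    pos = [p for p, label in sample if label]
    if not pos:
        return []
    extremes = [
        min(pos, key=lambda p: p[0]),
        max(pos, key=lambda p: p[0]),
        min(pos, key=lambda p: p[1]),
        max(pos, key=lambda p: p[1]),
    ]
    return list(dict.fromkeys(extremes))  # drop repeats, keep order


def reconstruct(kept):
    """Smallest axis-parallel rectangle (x0, y0, x1, y1) containing `kept`."""
    if not kept:
        return None
    xs = [p[0] for p in kept]
    ys = [p[1] for p in kept]
    return (min(xs), min(ys), max(xs), max(ys))


def predict(rect, p):
    """Label a point with the reconstructed hypothesis."""
    if rect is None:
        return False
    x0, y0, x1, y1 = rect
    return x0 <= p[0] <= x1 and y0 <= p[1] <= y1
```

The retained points span the same bounding rectangle as all positive examples, and that rectangle lies inside the target, so the reconstructed hypothesis agrees with every label in a consistently labeled sample while storing no more than four points.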
....∩ {c : c ∈ C and S ⊆ c}. A concept class C is intersection closed if, whenever S is a finite subset of some concept in C, the closure of S, denoted CLOS(S), is a concept of C. Clearly, axis-parallel rectangles in R^n are intersection closed, and there are many other examples given by Helmbold, Sloan, and Warmuth (1990, 1992), such as monomials, vector spaces in R^n, and integer lattices. For any set of examples labeled consistently with some concept in the intersection-closed class C, consider the closure of the positive examples. This concept, the smallest concept in C containing those positive examples, is ....
....the whole sample. A minimal spanning set of the positive examples is any minimal subset of the positive examples whose closure is the same as the closure of all positive examples. Such a minimal spanning set can be used as a compression set for the sample. Helmbold et al. (1990) have proved that the size of such minimal spanning sets is always bounded by the VC dimension of the intersection-closed concept class. Thus, any intersection-closed class has a sample compression scheme whose size is at most the VC dimension of the class. Surprisingly, using the methods of ....
Helmbold, D., Sloan, R., and Warmuth, M. (1990). Learning nested differences of intersection-closed concept classes. Machine Learning 5, 165--196.
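A minimal spanning set, as defined in the context above, can be found by discarding positive examples one at a time as long as the closure does not change. The sketch below is a hedged illustration: it takes the closure operator as a parameter and assumes that operator is monotone (adding points never shrinks the closure), which holds for intersection-closed classes. `minimal_spanning_set` and `bounding_box` are hypothetical names, and the planar bounding box is only an example closure.

```python
def minimal_spanning_set(positives, closure):
    """Greedily drop points while the closure of the remainder is unchanged.

    Assumes `closure` is monotone, so one pass yields a minimal subset.
    """
    full = closure(positives)
    kept = list(positives)
    for p in list(positives):
        trial = [q for q in kept if q != p]
        if trial and closure(trial) == full:
            kept = trial
    return kept


def bounding_box(points):
    """Example closure: smallest axis-parallel rectangle around the points."""
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    return (min(xs), min(ys), max(xs), max(ys))
```

For the hypothetical points `(0, 0)`, `(2, 1)`, `(1, 3)`, `(4, 2)`, `(2, 2)`, the interior points `(2, 1)` and `(2, 2)` are dropped, leaving three points whose bounding box equals that of the full set. This is within the bound of four that the quoted result gives for planar rectangles.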
....1 A spanning set of a concept c ∈ C with respect to the class C is a set I ⊆ c having the property that c is the unique most specific concept consistent with the instances in I. This generalizes the notion of a spanning set of an intersection-closed class given by Helmbold, Sloan, and Warmuth [10]. Finally, we define I(C) for concept class C as follows: I(C) = max_{c ∈ C} {|I_c| : I_c is a minimal spanning set for c with respect to C}. To provide some intuition for our more general result, we first consider the simple case for which I(C) 1 and the concept class C has a unique most ....
....This algorithm is like the algorithm described in Theorem 4 except that it interleaves the steps described for learning the two corners. We conjecture that the condition required in Theorem 8 holds for all intersection-closed concept classes. Combined with the result from Helmbold, Sloan, and Warmuth [10] that I(C) ≤ vcd(C), this would give the following. Conjecture 1 For all intersection-closed concept classes, sdc(C) ≤ vcd(C). We now extend Theorem 8 to concept classes that are made of the disjunction of concepts from a concept class for which it currently applies. Theorem 9 Let C be a concept class ....
David Helmbold, Robert Sloan, and Manfred K. Warmuth. Learning nested differences of intersection-closed concept classes. Machine Learning, 5:165--196, 1990. Special issue for COLT 89.
....Of course, it would be nicer to discover a good algorithm than to discover a hardness proof. Another important open class involves unions of boxes. Algorithms exist for learning arbitrary unions of axis parallel rectangles in E 2 , and arbitrary nested differences of axis parallel boxes in E n [8, 15]. Arbitrary unions of boxes (regardless of whether they are axis parallel) in E n cannot be pac learnable since the VC dimension is not polynomial in n. It is open whether unions of up to O(n) axis parallel boxes (or even of O(log n) in E n can be pac learned. 7 This concept class is of ....
David Helmbold, Robert Sloan, and Manfred K. Warmuth. Learning nested differences of intersection-closed concept classes. Machine Learning, 5(2):165--196, 1990.
David Helmbold, Robert Sloan, and Manfred K. Warmuth. Learning nested differences of intersection-closed concept classes. In Proceedings of the 2nd Annual Workshop on Computational Learning Theory, pages 41--56, 1989.