Yoav Freund. An improved boosting algorithm and its implications on learning complexity. In Proceedings of the Fifth Annual ACM Workshop on Computational Learning Theory, pages 391--398, July 1992.
....and the output hypothesis need only have error rate slightly less than 1/2. In other words, the output of a weak learning algorithm need only perform slightly better than random guessing. A fundamental and surprising result first shown by Schapire [28, 29] and later improved upon by Freund [14, 15] states that any algorithm which efficiently weakly learns can be transformed into an algorithm which efficiently strongly learns. These results have important consequences for PAC learning, including providing upper bounds on the time and sample complexities of strong learning. One criticism of ....
....We define weak SQ learning in a manner analogous to weak PAC learning, and we show that it is possible to boost the accuracy of weak SQ algorithms to obtain strong SQ algorithms. Thus, we show that weak SQ learning is equivalent to strong SQ learning. We use the technique of boosting by majority [15] which is nearly optimal in terms of its dependence on the accuracy parameter ε. In the SQ model, as in the PAC model, this boosting result allows us to derive general upper bounds on many complexity measures of learning. Specifically, we derive simultaneous upper bounds with respect to ε on the ....
Yoav Freund. An improved boosting algorithm and its implications on learning complexity. In Proceedings of the Fifth Annual ACM Workshop on Computational Learning Theory, pages 391--398. ACM Press, 1992.
....of the class under any distribution. Therefore, finding a weak PAC learning algorithm for DNF formulas under any distribution is as hard as finding a (strong) PAC learning algorithm for DNF formulas under any distribution. (Footnote: This research was supported in part by NSERC of Canada.) Freund [F90, F92] showed that weak PAC learning a class under any distribution D' that is poly away from D, i.e. satisfies D/poly(n) ≤ D' ≤ D·poly(n), implies PAC learning under the distribution D. Jackson [J94] showed that DNF is weakly learnable under any distribution that is poly away from the uniform ....
Y. Freund. An improved boosting algorithm and its implications on learning complexity, Proc. 5th Annu. Workshop on Comput. Learning Theory, ACM Press, New York, NY, 1992, 391--398.
....and the output hypothesis need only have error rate slightly less than 1/2. In other words, the output of a weak learning algorithm need only perform slightly better than random guessing. A fundamental and surprising result first shown by Schapire [28, 29] and later improved upon by Freund [14, 15] states that any algorithm which efficiently weakly learns can be transformed into an algorithm which efficiently strongly learns. These results have important consequences for PAC learning, including providing upper bounds on the time and sample complexities of strong learning. One criticism of ....
....We define weak SQ learning in a manner analogous to weak PAC learning, and we show that it is possible to boost the accuracy of weak SQ algorithms to obtain strong SQ algorithms. Thus, we show that weak SQ learning is equivalent to strong SQ learning. We use the technique of boosting by majority [15] which is nearly optimal in terms of its dependence on the accuracy parameter ε. In the SQ model, as in the PAC model, this boosting result allows us to derive general upper bounds on many complexity measures of learning. Specifically, we derive simultaneous upper bounds with respect to ε on ....
Yoav Freund. An improved boosting algorithm and its implications on learning complexity. In Proceedings of the Fifth Annual ACM Workshop on Computational Learning Theory, pages 391--398. ACM Press, 1992.
.... and Mansour [26] can be used to find such a parity (using membership queries). Combining these two facts gives that the KM algorithm weakly learns DNF with respect to uniform [9]. An obvious method to consider for turning this weak learner into a strong learner is some form of hypothesis boosting [29, 18, 17, 19]. In fact, HS is based on a particularly simple and efficient version of boosting discovered by Freund [18]. Each stage i of Freund's boosting algorithm explicitly defines a distribution D_i and calls on a weak learner to produce a weak approximator with respect to D_i. Distribution D_i is defined in ....
Y. Freund, An improved boosting algorithm and its implications on learning complexity, in Proceedings of the Fifth Annual Workshop on Computational Learning Theory, 1992, pp. 391-398.
....is poly(n, log(m)). Lecture 14: October 31, 1994. 14.2.4 An algorithm with a single majority gate. Consider learning from a single majority gate, in other words, where our output is the result of a single majority function applied to some set of k hypotheses. This technique, due to Freund [1], gives the best theoretic results known to date. Let γ = 1/p(n) and our committee size k = (1/(2γ^2)) log(1/…). Consider training the particular hypothesis h_{i+1}. Let r be the number of the first i committee members which were correct on the current example. We wish to decide whether to ....
Y. Freund. An improved boosting algorithm and its implications on learning complexity. Proceedings of the 5th Workshop on Computational Learning Theory, pages 391-398, ACM Press, New York, NY, 1992.
....factor of P Phi , implying that ae = Omega Gamma331 6 General Bounds on the Complexity of Relative Error Statistical Query Learning. In this section we show general upper bounds on the complexity of relative error statistical query learning. We do so by applying accuracy boosting techniques [10, 11, 18] and specifically, these techniques as applied in the statistical query model [2]. Theorem 10: If a concept class F is SQ learnable by an algorithm A using hypothesis class H, then F is SQ learnable with O(N 0 log 2 (1= queries each with threshold Omega Gamma 0 0 = log(1= and ....
Yoav Freund. An improved boosting algorithm and its implications on learning complexity. In Proceedings of the Fifth Annual ACM Workshop on Computational Learning Theory, pages 391--398. ACM Press, 1992.
....bound on the Sieve from roughly O(ns 8 = 1 2) to O(ns 4 = 4 ). As Klivans and Servedio have pointed out [16], replacing the original Sieve's boosting algorithm with an alternative boosting algorithm can produce further improvement. In particular, one of Freund's boosting algorithms [11] (called B_Comb by Klivans and Servedio) will call the weak parity algorithm with jgj bounded by O(1= and bounded as above, yet still runs for only O(s 2 ) boosting stages. This brings the time bound for the overall algorithm down to O(ns 4 = 2 ), but at the expense of a ....
Yoav Freund. An Improved Boosting Algorithm and Its Implications on Learning Complexity. In Proceedings of the 5th Ann. Workshop on Computational Learning Theory, 391-398, 1992.
....of Freund's boosting algorithms, choosing this particular method because the distributions it uses when boosting with respect to uniform have simple closed forms and are efficiently computable. (Footnote 6: Jackson points out that some of Freund's more efficient algorithms [Fre92] can also be used to learn DNF, but the analysis involved is more complicated.) Freund proves the following: Lemma 5.9 ([Fre90]) Let WL be a weak learner for the function class C. That is, given access to oracle Ex(f, D) (for any f ∈ C and any distribution D) and confidence parameter δ, WL produces in polynomial time, ....
Yoav Freund. An improved boosting algorithm and its implications on learning complexity. In Proceedings of the Fifth Annual Workshop on Computational Learning Theory, pages 391--398, 1992.
....to the space. 2 The word polynomial in the title of [Byl93] means polynomial in the inverse of the separation parameter, which as noted above can be exponential in n even when points are chosen from {0, 1}^n. 3 Thanks to Rob Schapire for pointing out that standard Boosting results [Sch90, Fre92] do not apply in the .... Theorem 1: The class of linear threshold functions in R^n can be learned in polynomial time in the PAC prediction model in the presence of random classification noise. Remark: The learning algorithm can be made to fit the Statistical Query learning model [Kea93]. The main ....
Y. Freund. An improved boosting algorithm and its implications on learning complexity. In Proceedings of the Fifth Annual ACM Workshop on Computational Learning Theory, pages 391-398. ACM Press, 1992.
....algorithm for our halfspace learning problem. In this section we use techniques from boosting and large margin classification to obtain a strong learning algorithm with small sample complexity. 4.1 BOOSTING TO ACHIEVE HIGH ACCURACY. In a series of important papers Schapire [31] and Freund [10, 11] have given boosting algorithms which transform weak learning algorithms into strong ones. In this paper we use the Adaboost algorithm from [13] which is shown in Figure 3; our notation for the algorithm is similar to that of [34, 35]. The input to Adaboost is a sequence S = ⟨x_1, y_1⟩, ....
Y. Freund. An improved boosting algorithm and its implications on learning complexity, in "Fifth Ann. Work. on Comp. Learning Theory" (1992), 391-398.
.... model is an algorithm which, given access to random labelled examples ⟨x, f(x)⟩ drawn from any distribution D, can generate a hypothesis h such that Pr_{x∈D}[f(x) = h(x)] ≥ 1 − ε for any ε > 0, while a weak learning algorithm [22] can only do this for some 1/2 > ε > 0. Schapire [25] and then Freund [10, 11] gave boosting algorithms which convert weak learners into strong learners, thus proving the equivalence of weak and strong learnability. Since then, boosting has been applied in a wide variety of contexts and continues to be an active area of research [6, 7, 8, 9, 13, 20, 26]. All known boosting ....
.... and Valiant in [22]. We will abuse notation and say that A is a (1/2 …)-approximate learning algorithm for f if A is a (1/2 …)-approximate learning algorithm for the concept class C which consists of the single function f. In a series of important results, Schapire [25] and subsequently Freund [10, 11] have shown that if A is a weak learning algorithm for a concept class C, then there exists a strong learning algorithm for C. Their proofs are highly constructive in that they give explicit boosting algorithms which transform weak learning algorithms into strong ones. We now formally define ....
Y. Freund. An improved boosting algorithm and its implications on learning complexity. In "Fifth Annual Workshop on Computational Learning Theory," (1992), pp.391-398.
.... there exists a family of distributions 𝒟 such that any D ∈ 𝒟 is learnable with a generator to within error ε, for any ε > 0, but the learned generator must be of size at least Ω(…/ε). This is in contrast to the distribution free PAC model, where the results of Schapire and Freund [30, 12] on precision boosting imply that the size of the hypothesis can always be much smaller than the number of samples, and be only polynomial in log 1/ε. The family D_n we construct may be based on any pseudo random generator. A member of D_n is defined by a seed s to a pseudo random sequence ....
Y. Freund, An improved boosting algorithm and its implication on learning complexity, Proc. 5th ACM Workshop on Computational Learning Theory, 1992, pp. 391--398.
.... and Mansour [26] can be used to find such a parity (using membership queries). Combining these two facts gives that the KM algorithm weakly learns DNF with respect to uniform [9]. An obvious method to consider for turning this weak learner into a strong learner is some form of hypothesis boosting [29, 18, 17, 19]. In fact, HS is based on a particularly simple and efficient version of boosting discovered by Freund [18]. Each stage i of Freund's boosting algorithm explicitly defines a distribution D_i and calls on a weak learner to produce a weak approximator with respect to D_i. Distribution D_i is defined ....
Y. Freund, An improved boosting algorithm and its implications on learning complexity, in Proceedings 5th Annual Workshop on Computational Learning Theory, Pittsburgh, PA, 1992, pp. 391--398.
....n and minimum distance d. 3 LEARNING DNF. In this section we develop more efficient algorithms for learning DNF under the uniform distribution. Our algorithms are based on Jackson's Harmonic Sieve for learning DNF expressions [J97]. The Sieve itself is based on one of Freund's boosting algorithms [F90, F93]. This boosting algorithm runs in stages. In each stage it creates a new distribution and assumes that the learner can find a weak hypothesis for the target (one that (1/2 − γ)-approximates the target) under this distribution. The algorithm performs O(γ^-2 log(1/ε)) stages. ....
Y. Freund. An Improved Boosting Algorithm and Its Implications on Learning Complexity. In Proceedings of the 5th Ann. Workshop on Computational Learning Theory, 391-398, 1992.
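The stage structure described in the context above (build a new distribution at each stage, obtain a weak hypothesis under it, combine the results) can be sketched roughly as follows. This is an illustrative skeleton only and is not Freund's algorithm: the reweighting rule (doubling the weight of misclassified examples), the constant `c`, and all function names are assumptions introduced here for illustration.

```python
import math


def num_stages(gamma, eps, c=1.0):
    """Stage count of the form c * gamma^-2 * ln(1/eps).

    The constant c and the natural log are assumptions; the quoted
    context only gives the O(gamma^-2 log(1/eps)) growth rate.
    """
    return math.ceil(c * math.log(1.0 / eps) / gamma ** 2)


def majority_vote(hypotheses, x):
    """Unweighted majority over hypotheses that each return +1 or -1."""
    votes = sum(h(x) for h in hypotheses)
    return 1 if votes >= 0 else -1


def stump_learner(examples, dist):
    """Hypothetical weak learner: best 1-D threshold stump under dist."""
    best = None
    for t, _ in examples:
        for sign in (1, -1):
            h = lambda x, t=t, s=sign: s if x >= t else -s
            err = sum(d for d, (x, y) in zip(dist, examples) if h(x) != y)
            if best is None or err < best[0]:
                best = (err, h)
    return best[1]


def boost(examples, weak_learner, stages):
    """Run `stages` rounds; examples is a list of (x, y) with y in {-1, +1}."""
    n = len(examples)
    dist = [1.0 / n] * n  # start from the uniform distribution
    hypotheses = []
    for _ in range(stages):
        h = weak_learner(examples, dist)
        hypotheses.append(h)
        # Placeholder reweighting (NOT Freund's rule): emphasize mistakes.
        weights = [d * (2.0 if h(x) != y else 1.0)
                   for d, (x, y) in zip(dist, examples)]
        total = sum(weights)
        dist = [w / total for w in weights]
    return lambda x: majority_vote(hypotheses, x)
```

For example, `boost(data, stump_learner, num_stages(0.1, 0.01))` would run the loop for the number of stages suggested by the bound with γ = 0.1 and ε = 0.01, under the assumed constant c = 1.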
....1984). In fact, he shows that arbitrarily high accuracy can be achieved by recursively applying the same procedure. Although his approach is limited to the PAC model of learning, some success was achieved in the domain of character recognition, using neural networks (Drucker et al. 1993). Freund (1992) has a similar approach, but with potentially many more sequentially generated distributions involved. Hansen and Salamon (1990) integrate an ensemble of neural networks by simple voting. The different networks in an ensemble are generated by randomized parameters. Kwok and Carter (1990) ....
Freund, Y. (1992). An improved boosting algorithm and its implications on learning complexity. Proc. 5th Work. Comp. Learning Theory (pp. 391--398).
....to the space. 2 The word polynomial in the title of [Byl93] means polynomial in the inverse of the separation parameter, which as noted above can be exponential in n even when points are chosen from {0, 1}^n. 3 Thanks to Rob Schapire for pointing out that standard Boosting results [Sch90, Fre92] do not apply in the .... Theorem 1: The class of linear threshold functions in R^n can be learned in polynomial time in the PAC prediction model in the presence of random classification noise. Remark: The learning algorithm can be made to fit the Statistical Query learning model [Kea93]. The main ....
Y. Freund. An improved boosting algorithm and its implications on learning complexity. In Proceedings of the Fifth Annual ACM Workshop on Computational Learning Theory, pages 391--398. ACM Press, 1992.
....[14] proved that a concept class C can be weakly learned in polynomial time if and only if it can be strongly learned in polynomial time. More precisely, he gives an efficient strong learning algorithm for C that uses an efficient weak learning algorithm for C as a subroutine. Subsequently, Freund [7, 8] has given a different technique for converting a weak learning algorithm into a strong learning algorithm. Combining this result with the lower bound provided by Blumer et al. one obtains an initial lower bound on weak learning sample complexity. This bound does not give an unconditional lower ....
....described by Haussler et al. [9]. Note that the output hypothesis can be encoded by the m(n) examples on which A was successfully trained. Under this encoding, the size s(n) of the output hypothesis in bits is m(n) times the number of bits needed to encode each example. Schapire [14] and Freund [7, 8] describe techniques for converting this weak learning algorithm into a strong learning algorithm A' outputting hypotheses of size O(s(n) · p(n)^α · (log(1/ε))^β) (1) for some constants α and β. If A' is run against a uniform distribution over a shattered set of ....
Yoav Freund. An improved boosting algorithm and its implications on learning complexity. In Proceedings of the Fifth Annual Workshop on Computational Learning Theory, pages 391--398, July 1992.
....a constant factor of P Phi , implying that ae = Omega Gamma 29 6 General Bounds on the Complexity of Relative Error Statistical Query Learning. In this section we show general upper bounds on the complexity of relative error statistical query learning. We do so by applying boosting techniques [9, 10, 16] and specifically, these techniques as applied in the statistical query model [1]. The proof of Theorem 8 is given in the appendix. Theorem 8: If a concept class F is SQ learnable by algorithm A, then F is SQ learnable with O Gamma N 0 log 2 (1= Delta queries each with threshold Omega ....
Yoav Freund. An improved boosting algorithm and its implications on learning complexity. In Proceedings of the Fifth Annual ACM Workshop on Computational Learning Theory, pages 391--398. ACM Press, 1992.
.... to depend on ε and in Section 4.5 that is actually a constant in many algorithms. 4.3 General Bounds on Relative Error SQ Learning. In this section we prove general upper bounds on the complexity of relative error statistical query learning. We do so by applying boosting techniques [9, 10, 17] and specifically, these techniques as applied in the statistical query model [4]. We first prove some useful lemmas which allow us to decompose relative estimates of ratios and sums. Lemma 3: Let a = b/c where 0 a; b; c 1. If an estimate of a is desired with ( error provided that c Phi, ....
Yoav Freund. An improved boosting algorithm and its implications on learning complexity. In Proceedings of the Fifth Annual ACM Workshop on Computational Learning Theory, pages 391--398. ACM Press, 1992.
....learning algorithm. An additional feature of Schapire's result is that it is constructive: obtaining weak algorithms from strong ones is trivial, and Schapire shows how to use a weak learning algorithm for a class as a subroutine in order to construct a strong learning algorithm for the class. Freund (1990; 1992) also gives results on boosting weak PAC algorithms to strong PAC algorithms. These results employ a somewhat simpler and more efficient construction than those of Schapire. In each case, the boosting methods can be used to prove general upper bounds, with respect to ε, on the complexity of PAC ....
....learning in a manner analogous to weak PAC learning, and in Section 4.1.2, we show that it is possible to boost the accuracy of weak SQ algorithms to obtain strong SQ algorithms. Thus, we show that weak SQ learning is equivalent to strong SQ learning. We use the technique of boosting by majority (Freund, 1992) (described in Section 4.1.1) which is nearly optimal in terms of its dependence on the accuracy parameter ε. In the SQ model, as in the PAC model, a boosting result allows us to derive general upper bounds on many complexity measures of learning. In Section 4.1.3, we derive simultaneous upper ....
Freund, Yoav. (1992). An improved boosting algorithm and its implications on learning complexity. In Proceedings of the Fifth Annual ACM Workshop on Computational Learning Theory, pages 391--398. ACM Press.
....accuracy than even the best individual expert or an induced combination rule. Keywords: Combining Classifiers, Inductive Learning, Intermediate Concepts, Reliability Models. Introduction: Using multiple learned models for increasing learning accuracy has attracted much recent interest (Freund 1990; Chan & Stolfo 1995; Ali & Pazzani 1996). A central problem is how to integrate several classifiers to produce a single final classification. One approach is to use the single most reliable classifier (as in CVM (Schaffer 1993; Merz 1995)) or to use a form of weighted voting (e.g., SAM (Merz ....
Freund, Y. 1990. An improved boosting algorithm and its implications on learning complexity. In Proc. Fifth Workshop on Computational Learning Theory.
....of correctly classified examples. 3.5 Weak and Strong Distribution Restricted Learning. Schapire [17] shows that for any class of target functions C, if there exists a distribution free weak learning algorithm for C, then there exists a distribution free strong learning algorithm for C. Freund [8] asks how much we can relax the requirement that the weak learning algorithm work for all distributions. In what cases can a D distribution restricted weak learning algorithm for concept class C be boosted to a strong learning algorithm? It would seem that in general, we could not boost the weak ....
Y. Freund. An improved boosting algorithm and its implications on learning complexity. In Proceedings of COLT '92, pages 391--398. Morgan Kaufmann, 1992.
....our boosting algorithm can be used with such algorithms, and that the accuracy of the hypothesis that it outputs is proportional to the sensitivity of the given learning algorithm to changes in the distribution of the instances. Parts of this work were previously published in [Freund, 1990] and [Freund, 1992]. 1.2 Query By Committee As we have discussed in the previous section, all random training examples are not created equal. In fact, there is often a very small fraction of the training examples whose labels carry all the information that is relevant for approximating the hidden concept, and ....
Y. Freund. An improved boosting algorithm and its implications on learning complexity. In Proceedings of the Fifth Workshop on Computational Learning Theory, pages 391--398, San Mateo, CA, 1992. Morgan Kaufmann.
No context found.
Yoav Freund. An improved boosting algorithm and its implications on learning complexity. In Proceedings of the Fifth Annual ACM Workshop on Computational Learning Theory, pages 391--398, July 1992.
No context found.
Yoav Freund. An improved boosting algorithm and its implications on learning complexity. In Proceedings of the Fifth Annual Workshop on Computational Learning Theory, pages 391--398, July 1992.