P. Auer and M. Warmuth. Tracking the best disjunction. Machine Learning, 1998. To appear in the Special Issue on Context Sensitivity and Concept drift. Earlier version in FOCS '95.
....its weights additively while Winnow uses multiplicative weight updates. Another major difference between these algorithms is that Winnow's mistake bound is logarithmic in N, whereas the Perceptron algorithm's mistake bound can be linear in N in the worst case [55]. Recently Auer and Warmuth [12], in generalizing the work of Littlestone [63], showed that Winnow makes at most $O(A + K \log N)$ mistakes on any sequence of trials where the target K-disjunction makes at most A attribute errors. The number of attribute errors of a labeled example $\langle X_t, y_t \rangle$ with respect to the target ....
....In the agnostic model, whenever the best hypothesis makes a prediction mistake, we only need to change at most K attributes of the example so that the classification is consistent. Thus we have the following interpretation of the mistake bound in the presence of attribute errors. Theorem 2.7 [12] Suppose in a sequence of trials for on-line learning an unknown boolean concept defined by K of N possible attributes, the best K-disjunction makes $M_{opt}$ mistakes (classification errors). Then Winnow, running with $\alpha = 1.75$, each initial weight $= 1/N$, and [...] $= \alpha \ln \alpha / (\alpha^2 - 1)$, makes ....
[Article contains additional citation context not shown here]
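The contrast drawn in the context above (multiplicative updates, initial weights of 1/N, promotion factor 1.75) can be illustrated with a minimal Python sketch of a basic Winnow-style learner. This is not the exact variant analyzed by Auer and Warmuth: the threshold `theta = 0.5`, the function names, and the toy target disjunction are assumptions made only for illustration.

```python
from itertools import product


def winnow_predict(weights, x, theta):
    """Predict 1 if the total weight of the active attributes reaches theta."""
    return 1 if sum(w for w, xi in zip(weights, x) if xi) >= theta else 0


def winnow_update(weights, x, y, theta, alpha=1.75):
    """One trial: return (new_weights, made_mistake).

    Weights change only on a mistake, and only for active attributes:
    multiplied by alpha on a false negative, divided by alpha on a
    false positive. A Perceptron would add or subtract instead.
    """
    if winnow_predict(weights, x, theta) == y:
        return weights, False
    factor = alpha if y == 1 else 1.0 / alpha
    return [w * factor if xi else w for w, xi in zip(weights, x)], True


def run_winnow(examples, n, theta=0.5, alpha=1.75, passes=1):
    """Cycle through the examples, counting mistakes in each pass.

    theta=0.5 is a hypothetical choice, not the threshold from the theorem.
    """
    weights = [1.0 / n] * n  # each initial weight is 1/N
    mistakes_per_pass = []
    for _ in range(passes):
        mistakes = 0
        for x, y in examples:
            weights, wrong = winnow_update(weights, x, y, theta, alpha)
            mistakes += wrong
        mistakes_per_pass.append(mistakes)
    return weights, mistakes_per_pass


# Toy target (assumed for illustration): the 2-disjunction x0 OR x1, N = 6.
n = 6
examples = [(x, int(x[0] or x[1])) for x in product((0, 1), repeat=n)]
weights, per_pass = run_winnow(examples, n, passes=20)
print(per_pass)
```

On this noise-free toy target the relevant weights are only ever promoted, so the mistake count per pass drops to zero after a few passes; attribute errors, as treated in the cited bound, are not modeled here.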
P. Auer and M. Warmuth. Tracking the best disjunction. Machine Learning, 1998. To appear in the Special Issue on Context Sensitivity and Concept drift. Earlier version in FOCS '95.
....its weights additively while Winnow uses multiplicative weight updates. Another major difference between these algorithms is that Winnow's mistake bound is logarithmic in N, whereas the Perceptron algorithm's mistake bound can be linear in N in the worst case [20]. Recently Auer and Warmuth [5], in generalizing the work of Littlestone [25], showed that Winnow makes at most $O(A + K \log N)$ mistakes on any sequence of trials where the target K-disjunction makes at most A attribute errors. The number of attribute errors of a labeled example $\langle X_t, y_t \rangle$ with respect to the target ....
....In the agnostic model, whenever the best hypothesis makes a prediction mistake, we only need to change at most K attributes of the example so that the classification is consistent. Thus we have the following interpretation of the mistake bound in the presence of attribute errors. Theorem 1 [5] Suppose in a sequence of trials for on-line learning an unknown boolean concept defined by K of N possible attributes, the best K-disjunction makes $M_{opt}$ mistakes (classification errors). Then Winnow, running with $\alpha = 1.75$, each initial weight $= 1/N$, and [...] $= \alpha \ln \alpha / (\alpha^2 - 1)$, makes ....
[Article contains additional citation context not shown here]
Peter Auer and Manfred Warmuth. Tracking the best disjunction. Machine Learning, 1998. To appear in the Special Issue on Context Sensitivity and Concept drift.
No context found.
P. Auer and M. Warmuth. Tracking the best disjunction. Machine Learning, 1998. To appear in the Special Issue on Context Sensitivity and Concept drift. Earlier version in FOCS '95.