102 citations found.
M. Kearns, R. Schapire, and L. Sellie. Toward efficient agnostic learning. Machine Learning, 17:115--141, 1994.


This paper is cited in the following contexts:


Correlation Clustering - Bansal, Blum, Chawla (2002)   (15 citations)

....in this language, then we have to say (u, w) is positive too. So, we might not be able to represent f perfectly. This sort of problem, trying to find the (nearly) best representation of some arbitrary target f in a given limited hypothesis language, is sometimes called agnostic learning [17, 6]. The observation that one can trivially agree with at least half the edge labels is equivalent to the standard machine learning fact that one can always achieve error at most 1/2 using either the all positive or all negative hypothesis. Our PTAS for approximating the number of agreements means ....

M. J. Kearns, R. E. Schapire, and L. M. Sellie. Toward efficient agnostic learning. Machine Learning, 17(2/3):115--142, 1994.


On the Generalization Ability of Recurrent Networks - Hammer (2001)

....directly: restricted transition functions or restricted input distributions, respectively. Afterwards, we derive posterior bounds on the generalization capability which depend on the concrete training set. For this purpose we generalize the luckiness framework to the general agnostic setting [10, 12, 16]. Unlike in [9] we obtain results which cover long term prediction and allow the restriction to representative parts of data. 2 The Learning Scenario A single layer FNN computes a function f : R ; x 7 (A x ) where A 2 R n m , 2 R , and denotes the componentwise application of ....

M. J. Kearns, R. E. Schapire, and L. M. Sellie. Toward Efficient Agnostic Learning. Machine Learning, 17, 1994.


Robust Trainability of Single Neurons - Höffgen, Simon, Van Horn (1995)   (30 citations)

....0 ) as the measure of our learning success. This means we use the class H 0 as a touchstone class to define the goal of the learning problem, but allow a hypothesis class H H 0 , which provides the hypotheses of our learning algorithm. This framework was introduced by Kearns et al. in [18]. It is especially useful to overcome representational problems inherent to a given touchstone class. We combine these in the following definitions: • We say that the H 0 degradation of a learning algorithm L for C, H is limited by d if, in the confident case, L outputs a hypothesis h ∈ H ....

M. J. Kearns, R. E. Schapire, and L. M. Sellie, Toward efficient agnostic learning, in "Proceedings of the 5th Annual Workshop on Computational Learning Theory, 1992," pp. 341--353.


Cross-Validation for Binary Classification by Real-Valued.. - Anthony, Holden (1999)   (1 citation)

.... respect to P ) is er_P(f) = P({(x, y) ∈ X × {−1, 1} : sgn(f(x)) ≠ y}). The use of real-valued functions for binary classification has been considered in [4, 7, 8, 24] within versions of the PAC model of learning and versions of agnostic PAC learning (see Kearns et al. [19] and Haussler [14]), and it has been shown that there are advantages in considering the values of the real function during training, rather than merely its sign. In particular, as suggested in [13, 27], classification with a large margin (the distance between the classification boundary and ....

M. J. Kearns, R. E. Schapire, and L. M. Sellie. Toward efficient agnostic learning. In Proceedings of the 5th Annual Workshop on Computational Learning Theory, pages 341--352. ACM Press, New York, NY, 1992.


Sample-efficient Strategies for Learning in the.. - Cesa-Bianchi.. (1999)

....ae is shown in Figure 4. In the same figure we also plot return to investment ratio, showing that the best strategy for the adversary is to balance the labels whence this ratio becomes 1/2. Note that, as our final goal is to bound the [Footnote 4: The function H was also used by Kearns, Schapire and Sellie [6] in connection with agnostic learning.] [Figure 4: Curve of the coin rule H(p, q) = q^2/(p^2 + q^2) (thin) and of the return to investment ratio H(p, q) · p/q (thick); the curves are scaled to p ....

Michael J. Kearns, Robert E. Schapire, and Linda M. Sellie. Toward efficient agnostic learning. Machine Learning, 17(2/3):115--142, 1994.


Learning Polynomials With Queries: The Highly Noisy Case - Goldreich, Rubinfeld, Sudan (1995)   (22 citations)

....Research supported in part by a Sloan Foundation Fellowship and NSF Career Award CCR 9875511. Email: madhu@mit.edu. A second interpretation of the reconstruction problem is within the framework of agnostic learning introduced by Kearns et al. [23] (see also [29, 30, 24]). In the setting of agnostic learning, the learner is to make no assumptions regarding the natural phenomenon underlying the input-output relationship of the function, and the goal of the learner is to come up with a simple explanation that best fits the examples. Therefore ....

Michael J. Kearns, Robert E. Schapire, and Linda M. Sellie. Toward efficient agnostic learning (extended abstract). Proceedings of the Fifth Annual ACM Workshop on Computational Learning Theory, pp. 341-352, Pittsburgh, Pennsylvania, ACM Press, 1992.


Learning with Restricted Focus of Attention - Ben-David, Dichterman (1997)

....for hidden variables learning problems in the p concept model. In [19] it is shown how to learn any Boolean formula of the form , where is a conjunction over the set of visible variables, and is any Boolean function over the set of hidden variables. Kearns, Schapire and Sellie show in [20] how to learn any k term DNF formula (DNF formula with up to k terms; see Chapter 3 for the definition of DNF formulas) where the formula may depend on both the hidden and the visible variables. As mentioned above, we are mainly concerned with a different type of hidden variables learning ....

Michael J. Kearns, Robert E. Schapire, and Linda M. Sellie. Toward efficient agnostic learning. In Proceedings of the 5th Annual Workshop on Computational Learning Theory, pages 341--352, 1992.


On the Difficulty of Approximately Maximizing Agreements - Shai Ben-David Nadav

....half-spaces, axis-aligned hyper-rectangles, balls, monomials. [Footnote 3: For a list of typographical symbols used, please see LaTeX, by Leslie Lamport.] 1 Introduction. We study the computational complexity of agnostic learning with a variety of common hypothesis classes. The agnostic framework [14, 20] is a very useful variant of the PAC learning model in which, informally, the learning algorithm is required to do nearly as well as is possible using hypotheses from a given class. Haussler's work [14] (see also [22]) implies that learnability in this model is, in a sense, equivalent to the ....

....and Kann [1] showed that approximately maximizing agreements using half-spaces is APX-complete. Also, it is not too hard to see that the following facts: • weak learning implies strong learning [23]; • half-spaces, balls, monomials and axis-aligned rectangles are weak approximators to DNF (see [20]); together imply that a fully polynomial time approximation scheme for approximately maximizing agreements using any of half-spaces, balls, monomials or axis-aligned rectangles would imply the learnability of DNF. In this paper we consider several common concept classes monomials and monotone ....

[Article contains additional citation context not shown here]

M. J. Kearns, R. E. Schapire, and L. M. Sellie. Toward efficient agnostic learning. Machine Learning, 17:115--141, 1994.


Agnostic Boosting - Ben-David, Long, Mansour

....close to zero as possible. The second allows an arbitrary target function, but rather than shooting for absolute success, compares the error of the learner's hypothesis to that of the best predictor in some pre-specified comparison class of predictors. This model is also known as agnostic learning [8]. When one tries to consider which model is more realistic, it has to be the case that the agnostic model wins. We rarely know if there is a clear target function, let alone if it belongs to some simple class of hypotheses. The aim of this paper is to study the boosting question in an agnostic ....

M. J. Kearns, R. E. Schapire, and L. M. Sellie. Toward efficient agnostic learning. Machine Learning, 17:115--141, 1994.


PAC Learning with Nasty Noise - Bshouty, Eiron, Kushilevitz (1999)   (3 citations)

....from the sample). Another situation to which our model is related is the setting of Agnostic Learning. In this model, a concept class is not given. Instead, the learning algorithm needs to minimize the empirical error while using a hypothesis from a predefined hypotheses class (see, for example, [18] for a definition of the model). Assuming the best hypothesis classifies the input up to an η fraction, we may alternatively see the problem as that of learning the hypotheses class under nasty noise of rate η. However, we note that the success criterion in the agnostic learning literature is ....

M. J. Kearns, R. E. Schapire, and L. M. Sellie, "Toward Efficient Agnostic Learning", Machine Learning, vol. 17(2), pp. 115--142, 1994.


Probabilistic Analysis of Learning in Artificial Neural Networks: .. - Anthony (1994)   (10 citations)

....error among hypotheses from H 0 , but performs well with respect to H . This model of learning, in which the aim is to produce, from a class H 0 with H 0 H , an output hypothesis only slightly worse than the best approximation in H , was introduced by Kearns, Schapire and Sellie [65], who called it agnostic learning. 10 Distribution Specific Learning. Perhaps the main attraction of the definition of PAC learning is the distribution-free criterion: the sample complexity is independent of the probability distribution. The proofs of the standard computational hardness results ....

M. J. Kearns, R. E. Schapire, and L. M. Sellie. Toward efficient agnostic learning. In Proc. 5th Annu. Workshop on Comput. Learning Theory, pages 341--352. ACM Press, New York, NY, 1992.


Exploring Applications of Learning Theory to Pattern Matching and.. - Scott (1998)   (1 citation)

....(as in the PAC model) we bound the number of prediction mistakes that the learner makes on the sequence of examples it sees when these examples are adversarially generated. We give an on-line algorithm (based on the algorithm Winnow [61]) to learn geometric patterns. This algorithm is agnostic [42, 53] in the sense that its error bounds make no assumptions whatsoever about the target concept to be learned (as opposed to our PAC algorithms, whose error bounds break if our assumptions are violated). Our algorithm is also tolerant of concept shift, i.e. if the target concept changes over time, ....

....function: L(ŷ_t, y_t) is 1 if ŷ_t ≠ y_t, and 0 otherwise. The performance of the on-line learner is measured by the total loss (which is equivalent to the number of prediction mistakes made when using the discrete loss function) over all trials. Our on-line learning algorithms are agnostic [42, 53] in the sense that they make no assumptions whatsoever about the target concept to be learned. Instead, we compare their performance with the performance of the best hypothesis selected from a comparison or touchstone class. For a sequence of trials, the best hypothesis from the touchstone class ....

M. J. Kearns, R. E. Schapire, and L. M. Sellie. Toward efficient agnostic learning. Machine Learning, 17(2/3):115--142, 1994.


Entropy Estimation - Bercher, Vignat (1996)   (78 citations)

....there is not much a prediction algorithm can do. To set a reasonable goal, we measure the performance of the algorithm against the performances of predictors from some fixed comparison class P . The comparison class is analogous to the touchstone class of the agnostic PAC model of learning [KSS94]. The algorithm is required to perform well if at least one predictor from the comparison class performs well. At the extremes, the outcomes could be completely random, in which case they can be predicted neither by the algorithm nor any predictor from the comparison class P , or the outcomes ....

Michael J. Kearns, Robert E. Schapire, and Linda M. Sellie. Toward efficient agnostic learning. Machine Learning, 17(2/3):115--142, 1994.


Learning from Data of Variable Quality - Koby Crammer Michael   Self-citation (Kearns)

No context found.

M. Kearns, R. Schapire, and L. Sellie. Toward efficient agnostic learning. Machine Learning, 17:115--141, 1994.


An Experimental and Theoretical Comparison of Model.. - Kearns, Mansour, Ng, Ron   (57 citations)  Self-citation (Kearns)

No context found.

M. J. Kearns, R. E. Schapire, and L. M. Sellie. Toward efficient agnostic learning. In Proceedings of the 5th Annual Workshop on Computational Learning Theory, pages 341--352. ACM Press, New York, NY, 1992.


Algorithmic Stability and Sanity-Check Bounds for Leave-One-Out .. - Kearns, Ron (1997)   (42 citations)  Self-citation (Kearns)

No context found.

M. Kearns, R. Schapire, and L. Sellie. Toward efficient agnostic learning. Machine Learning, 17:115--141, 1994.


Efficient Distribution-free Learning of Probabilistic Concepts - Kearns, Schapire (1993)   (108 citations)  Self-citation (Kearns Schapire)

No context found.

Michael J. Kearns, Robert E. Schapire, and Linda M. Sellie. Toward efficient agnostic learning. In Proceedings of the Fifth Annual ACM Workshop on Computational Learning Theory, pages 341--352, July 1992.


Weakly Learning DNF and Characterizing Statistical - Query Learning Using   Self-citation (Kearns)

No context found.

Michael J. Kearns, Robert E. Schapire, and Linda M. Sellie. Toward efficient agnostic learning. In Fifth pages 341--352, 1992.


Journal of Machine Learning Research 7 (2006) 55--83.. - Michael Schmitt..

No context found.

Michael J. Kearns, Robert E. Schapire, and Linda M. Sellie. Toward efficient agnostic learning. Machine Learning, 17:115--141, 1994.


The VC-Dimension of Subclasses of Pattern Languages - Mitchell, Scheffer, Sharma, .. (1999)   (5 citations)

No context found.

M. Kearns, R. Schapire, and L. Sellie. Toward efficient agnostic learning. In Proceedings of the Fifth Annual Workshop on Computational Learning Theory, pages 341--352. ACM Press, 1992.


Property Testing - Ron (2000)   (16 citations)

No context found.

M. J. Kearns, R. E. Schapire, and L. M. Sellie. Toward efficient agnostic learning. Machine Learning, 17(2-3):115--141, 1994.


Cost-Sensitive Learning by Cost-Proportionate Example.. - Zadrozny, Langford, Abe (2003)   (2 citations)

No context found.

Kearns, M., Schapire, R., & Sellie, L. Toward Efficient Agnostic Learning. Machine Learning, 17, 115-141, 1994.


Correlation Clustering - Nikhil Bansal Avrim (2002)   (15 citations)

No context found.

M. J. Kearns, R. E. Schapire, and L. M. Sellie. Toward efficient agnostic learning. Machine Learning, 17(2/3):115--142, 1994.


Learning Multivalued Multithreshold Functions - Department

No context found.

M. J. Kearns, R. E. Schapire and L. M. Sellie. Toward efficient agnostic learning. Machine Learning 17(2/3), 1994: 115--142.


Links between Learning and Optimization: a Brief Tutorial - Anthony (2003)

No context found.

M. J. Kearns, R. E. Schapire and L. M. Sellie. Toward efficient agnostic learning. Machine Learning 17(2/3), 1994: 115--142.



CiteSeer.IST - Copyright Penn State and NEC