Results 11-20 of 97
On Weak Learning
Journal of Computer and System Sciences, 1995
Abstract

Cited by 54 (10 self)
This paper presents relationships between weak learning, weak prediction (where the probability of being correct is slightly larger than 50%), and consistency oracles (which decide whether or not a given set of examples is consistent with a concept in the class). Our main result is a simple polynomial prediction algorithm which makes only a single query to a consistency oracle and whose predictions have a polynomial edge over random guessing. We compare this prediction algorithm with several of the standard prediction techniques, deriving an improved worst-case bound on the Gibbs algorithm in the process. We use our algorithm to show that a concept class is polynomially learnable if and only if there is a polynomial probabilistic consistency oracle for the class. Since strong learning algorithms can be built from weak learning algorithms, our results also characterize strong learnability.
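The single-query prediction idea in this abstract can be illustrated with a toy sketch (not the paper's actual algorithm): guess a random label, then keep or flip it depending on whether the sample extended with that guess passes a brute-force consistency oracle. The threshold concept class and all names below are illustrative assumptions.

```python
import random

# Toy concept class: thresholds on {0..10}; c_t(x) = 1 iff x >= t.
CONCEPTS = [lambda x, t=t: int(x >= t) for t in range(11)]

def consistent(sample):
    """Consistency oracle: does some threshold concept fit every example?"""
    return any(all(c(x) == y for x, y in sample) for c in CONCEPTS)

def weak_predict(sample, x):
    """Predict the label of x with a single oracle query: guess a random
    bit b, keep it if sample + [(x, b)] is consistent, otherwise flip."""
    b = random.randint(0, 1)
    return b if consistent(sample + [(x, b)]) else 1 - b

# Usage: this labelled sample pins the threshold to {3, 4, 5},
# so every consistent concept labels 6 positively.
sample = [(2, 0), (5, 1)]
print(weak_predict(sample, 6))  # → 1
```

When the data forces the label, the flip step corrects a wrong guess; when it does not, the random guess still gives the slight edge the abstract describes.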
Learning with Restricted Focus of Attention
1997
Abstract

Cited by 45 (2 self)
We consider learning tasks in which the learner faces restrictions on the amount of information he can extract from each example he encounters. We introduce a formal framework for the analysis of such scenarios, which we call RFA (Restricted Focus of Attention) learning. While it is a natural refinement of the PAC learning model, some of the fundamental PAC-learning results and techniques fail in the RFA paradigm: learnability in the RFA model is no longer characterized by the VC dimension, and many PAC learning algorithms are not applicable in the RFA setting. Hence, the RFA formulation reflects the need for new techniques and tools to cope with some fundamental constraints of realistic learning problems. In this work we also present some paradigms and algorithms that may serve as a first step towards answering this need. Two main types of restrictions are considered here: in the stronger one, called k-RFA, only k of the n attributes of each example are revealed to the learner, while in the weaker one, called k-wRFA, the restriction is made on the size of each observation (k bits), and no restriction is made on how the observations are extracted from the examples. For the stronger k-RFA restriction we develop a general technique for composing efficient k-RFA algorithms, and apply it to deduce, for instance, the efficient k-RFA learnability of k-DNF formulas, and the efficient 1-RFA learnability of axis-aligned rectangles in the Euclidean space R^n. We also prove the k-RFA learnability of richer classes of Boolean functions (such as k-decision lists) with respect to a given distribution, and the efficient (n-1)-RFA learnability (for fixed n), under product distributions, of classes of subsets of R^n which are defined by mild surfaces. ...
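As a hedged illustration of the k-RFA restriction with k = 1, the sketch below learns a monotone conjunction while inspecting only one attribute (plus the label) per drawn example: keep an attribute unless it is ever observed to be 0 on a positive example. The attribute-by-attribute loop and all names are assumptions for illustration, not the paper's algorithms.

```python
import random

def learn_monomial_1rfa(draw_example, n, trials_per_attr=200):
    """1-RFA sketch: each drawn example reveals only ONE attribute (our
    choice) plus its label. Keep attribute i in the monotone conjunction
    unless it is ever observed to be 0 on a positive example."""
    keep = [True] * n
    for i in range(n):
        for _ in range(trials_per_attr):
            x, y = draw_example()
            if y == 1 and x[i] == 0:  # x[i] is the only attribute we read
                keep[i] = False
                break
    return [i for i in range(n) if keep[i]]

# Usage: hypothetical target x0 AND x2 over n = 4 uniform random bits.
n = 4
def draw():
    x = [random.randint(0, 1) for _ in range(n)]
    return x, int(x[0] == 1 and x[2] == 1)

print(learn_monomial_1rfa(draw, n))  # → [0, 2] with overwhelming probability
```

Irrelevant attributes are 0 on a positive example with constant probability per draw, so a polynomial number of single-attribute observations suffices to weed them out.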
Probably Approximately Correct Learning
Proceedings of the Eighth National Conference on Artificial Intelligence, 1990
Abstract

Cited by 43 (1 self)
This paper surveys some recent theoretical results on the efficiency of machine learning algorithms. The main tool described is the notion of Probably Approximately Correct (PAC) learning, introduced by Valiant. We define this learning model and then look at some of the results obtained in it. We then consider some criticisms of the PAC model and the extensions proposed to address these criticisms. Finally, we look briefly at other models recently proposed in computational learning theory.

It's a dangerous thing to try to formalize an enterprise as complex and varied as machine learning so that it can be subjected to rigorous mathematical analysis. To be tractable, a formal model must be simple. Thus, inevitably, most people will feel that important aspects of the activity have been left out of the theory. Of course, they will be right. Therefore, it is not advisable to present a theory of machine learning as having reduced the entire field to its bare essentials. All ...
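A standard fact from the PAC literature that surveys like this one typically present: for a finite hypothesis class H, any learner that outputs a hypothesis consistent with roughly (1/ε)(ln|H| + ln(1/δ)) examples is ε-accurate with probability at least 1 - δ. A minimal sketch of that bound (the function name is ours):

```python
import math

def pac_sample_bound(h_size, epsilon, delta):
    """Occam bound for a finite class H: a hypothesis consistent with
    m >= (1/epsilon) * (ln|H| + ln(1/delta)) random examples has error
    at most epsilon with probability at least 1 - delta."""
    return math.ceil((math.log(h_size) + math.log(1 / delta)) / epsilon)

# Usage: a hypothetical class of 2^10 Boolean hypotheses.
print(pac_sample_bound(2**10, epsilon=0.1, delta=0.05))  # → 100
```

The bound is distribution-free, which is exactly the sense in which Valiant's model asks for "probably approximately correct" guarantees.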
Learning from Positive and Unlabeled Examples
Proceedings of the 11th International Conference on Algorithmic Learning Theory, 2000
Abstract

Cited by 38 (3 self)
In many machine learning settings, examples of one class (called positive class) are easily available. Also, unlabeled data are abundant.
Polynomial-time Learning of Elementary Formal Systems
Theoretical Computer Science, 2000
Abstract

Cited by 38 (8 self)
An elementary formal system (EFS) is a logic program consisting of definite clauses whose arguments have patterns instead of first-order terms. We investigate EFSs for polynomial-time PAC-learnability. A definite clause of an EFS is hereditary if every pattern in the body is a subword of a pattern in the head. With this new notion, we show that HEFS(m, k, t, r) is polynomial-time learnable, which is the class of languages definable by EFSs consisting of at most m hereditary definite clauses with predicate symbols of arity at most r, where k and t bound the number of variable occurrences in the head and the number of atoms in the body, respectively. The class defined by all finite unions of EFSs in HEFS(m, k, t, r) is also polynomial-time learnable. We also show an interesting series of NC-learnable classes of EFSs. As hardness results, the class of regular pattern languages is shown not to be polynomial-time learnable unless RP = NP. Furthermore, the related problem of deciding whether there is a common subsequence which is consistent with given positive and negative examples is shown NP-complete.
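The hereditary condition itself is easy to state operationally. Below is a simplified sketch that treats patterns as plain strings and checks the subword (contiguous substring) requirement; real EFS patterns also carry variables, which this toy check ignores, and all names are ours.

```python
def is_hereditary(clause):
    """Check the hereditary condition for a (simplified) definite clause:
    every pattern in the body must occur as a subword, i.e. contiguous
    substring, of some argument pattern in the head."""
    head_patterns, body_patterns = clause
    return all(any(p in h for h in head_patterns) for p in body_patterns)

# Usage with made-up patterns (uppercase letters standing for variables):
print(is_hereditary((["aXbY"], ["Xb", "bY"])))  # → True
print(is_hereditary((["aXbY"], ["Ya"])))        # → False
```

The restriction matters because it keeps the patterns in a derivation from growing, which is what makes the learnability argument go through.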
Can PAC Learning Algorithms Tolerate Random Attribute Noise?
Algorithmica, 1995
Abstract

Cited by 36 (6 self)
This paper studies the robustness of PAC learning algorithms when the instance space is {0,1}^n, and the examples are corrupted by purely random noise affecting only the attributes (and not the labels). For uniform attribute noise, in which each attribute is flipped independently at random with the same probability, we present an algorithm that PAC learns monomials for any (unknown) noise rate less than 1/2. Contrasting this positive result, we show that product random attribute noise, where each attribute i is flipped randomly and independently with its own probability p_i, is nearly as harmful as malicious noise: no algorithm can tolerate more than a very small amount of such noise.
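For intuition about uniform attribute noise, the sketch below simulates the flipping process and inverts its effect on a single bit's frequency, using the standard identity q = f(1-p) + (1-f)p, which is invertible whenever p ≠ 1/2. Names are illustrative, and this is not the paper's monomial learner.

```python
import random

def add_attribute_noise(x, p):
    """Uniform attribute noise: flip each bit of x independently with
    probability p; the label (not shown) is never touched."""
    return [b ^ (random.random() < p) for b in x]

def denoise_frequency(q, p):
    """A bit that is truly 1 with frequency f is observed as 1 with
    frequency q = f*(1 - p) + (1 - f)*p; invert this (needs p != 1/2)."""
    return (q - p) / (1 - 2 * p)

# Usage: recover a true frequency of 0.8 under noise rate p = 0.3.
random.seed(1)
p, trials = 0.3, 100_000
noisy = [add_attribute_noise([random.random() < 0.8], p)[0] for _ in range(trials)]
q = sum(noisy) / trials
print(round(denoise_frequency(q, p), 1))  # → 0.8
```

The denominator 1 - 2p blows up as p approaches 1/2, which matches the abstract's qualifier that the noise rate must be below 1/2.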
Learning Binary Relations and Total Orders
1993
Abstract

Cited by 36 (5 self)
The problem of learning a binary relation between two sets of objects or between a set and itself is studied. This paper represents a binary relation between a set of size n and a set of size m as an n × m matrix of bits whose (i, j) entry is 1 if and only if the relation holds between the corresponding elements of the two sets. Polynomial prediction algorithms are presented for learning binary relations in an extended online learning model, where the examples are drawn by the learner, by a helpful teacher, by an adversary, or according to a uniform probability distribution on the instance space. The first part of this paper presents results for the case in which the matrix of the relation has at most k row types. It presents upper and lower bounds on the number of prediction mistakes any prediction algorithm makes when learning such a matrix under the extended online learning model. Furthermore, it describes a technique that simplifies the proof of expected mistake bounds against a randomly chosen query sequence. In the second part of this paper the problem of learning a binary relation that is a total order on a set is considered. A general technique using a fully polynomial randomized approximation scheme (fpras) to implement a randomized version of the halving algorithm is described. This technique is applied to the problem of learning a total order, through the use of an fpras for counting the number of extensions of a partial order, to obtain a polynomial prediction algorithm that with high probability makes at most n lg n + (lg e)lg n mistakes when an adversary selects the query sequence. The case in which a teacher or the learner selects the query sequence is also considered.
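The halving-algorithm idea for total orders can be sketched by brute force for tiny n: predict each comparison by majority vote over the linear extensions consistent with what has been seen so far, which is exactly the count the paper's fpras approximates at scale. Names below are illustrative.

```python
from itertools import permutations

def halving_predict(n, constraints, query):
    """Halving sketch (brute force, tiny n): among all total orders on
    {0..n-1} consistent with the constraints (a, b) meaning 'a precedes
    b', predict the majority answer to 'does i precede j?'."""
    i, j = query
    votes = total = 0
    for order in permutations(range(n)):
        pos = {v: k for k, v in enumerate(order)}
        if all(pos[a] < pos[b] for a, b in constraints):
            total += 1
            votes += pos[i] < pos[j]
    return 2 * votes >= total  # ties go to True

# Usage: with 0<1 and 1<2 known, 3 of the 4 linear extensions put 0 before 3.
print(halving_predict(4, [(0, 1), (1, 2)], (0, 3)))  # → True
```

Predicting with the majority halves (or better) the consistent set on every mistake, which is where the logarithmic mistake bounds come from; the fpras makes the counting step polynomial.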
Agnostic boosting
Proceedings of the 14th Annual Conference on Computational Learning Theory, 2001
Abstract

Cited by 34 (7 self)
Martingale boosting is a simple and easily understood technique with a simple and easily understood analysis. A slight variant of the approach provably achieves optimal accuracy in the presence of misclassification noise.
Results on Learnability and the Vapnik-Chervonenkis Dimension
1991
Abstract

Cited by 33 (0 self)
We consider the problem of learning a concept from examples in the distribution-free model of Valiant. (An essentially equivalent model, if one ignores issues of computational difficulty, was studied by Vapnik and Chervonenkis.) We introduce the notion of dynamic sampling, wherein the number of examples examined may increase with the complexity of the target concept. This method is used to establish the learnability of various concept classes with an infinite Vapnik-Chervonenkis dimension. We also discuss an important variation on the problem of learning from examples, called approximating from examples. Here we do not assume that the target concept T is a member of the concept class C from which approximations are chosen. This problem takes on particular interest when the VC dimension of C is infinite. Finally, we discuss the problem of computing the VC dimension of a finite concept set defined on a finite domain and consider the structure of classes of a fixed small dimension.
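The last problem mentioned, computing the VC dimension of a finite concept set on a finite domain, admits a direct brute-force sketch (exponential in the domain size, shown for illustration only; names are ours):

```python
from itertools import combinations

def vc_dimension(domain, concepts):
    """Brute-force VC dimension of a finite concept class: the largest d
    such that some d-point subset of the domain is shattered, i.e. all
    2^d labelings of it are realized by some concept."""
    domain = list(domain)
    concepts = [frozenset(c) for c in concepts]
    best = 0
    for size in range(1, len(domain) + 1):
        for subset in combinations(domain, size):
            labelings = {tuple(x in c for x in subset) for c in concepts}
            if len(labelings) == 2 ** size:  # subset is shattered
                best = size
                break
        else:
            return best  # no subset of this size is shattered
    return best

# Usage: discrete intervals over {0,1,2,3} shatter 2 points but never 3.
intervals = [set(range(i, j)) for i in range(5) for j in range(i, 5)]
print(vc_dimension(range(4), intervals))  # → 2
```

The nested search makes the exponential cost of the problem concrete, which is part of why the structural questions the paper raises about fixed small dimension are interesting.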
Agnostic Learning of Geometric Patterns
Journal of Computer and System Sciences, 1997
Abstract

Cited by 29 (15 self)
Goldberg, Goldman, and Scott demonstrated how the problem of recognizing a landmark from a one-dimensional visual image can be mapped to that of learning a one-dimensional geometric pattern, and gave a PAC algorithm to learn that class. In this paper, we present an efficient online agnostic learning algorithm for learning the class of constant-dimension geometric patterns. Our algorithm can tolerate both classification and attribute noise. By working in higher-dimensional spaces we can represent more features from the visual image in the geometric pattern. Our mapping of the data to a geometric pattern, and hence our learning algorithm, is applicable to any data representable as a constant-dimensional array of values, e.g. sonar data, temporal difference infor...