David Haussler. Generalizing the PAC model for neural net and other learning applications. Technical Report UCSC-CRL-89-30, University of California Santa Cruz, Sep 1989.
....rather than exact values of coalitions are known. One of the prettiest applications of this general analysis is to the s-median problem. Approximation algorithms for the s-median problem are a useful tool in learning Lipschitz functions in the generalized PAC learning model of Haussler [34, 35]. To approximate a Lipschitz function, a memory-based learning system can be used, as proposed by Lin and Vitter [42]. We generalize the analysis of a greedy approximate solution of the s-median problem first considered by Cornuejols et al. [21]. We then compare its performance to the performance ....
....produced by the algorithm proposed by Lin and Vitter, the size of which is given by (2.4). 2.4.3 How to Prove That it Works. This section gives an outline of Lin and Vitter's correctness proof for the learning algorithm described in Section 2.4.1. First we quote two definitions after Haussler [34, 35]. For $r \in \mathbb{R}$ let $\mathrm{sign}(r) = 1$ iff $r > 0$, and zero otherwise. Definition 2.4.1 For $A \subseteq \mathbb{R}^d$ say $A$ is full if there exists an $x \in \mathbb{R}^d$ such that the set of sign vectors of the following sums is of the maximum size possible: $|\{\langle \mathrm{sign}(x_i + y_i) \rangle_{i=1}^{d} : y \in A\}| = 2^d$. Definition 2.4.2 Let F be a ....
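The fullness condition quoted above can be checked mechanically for a finite set of points and a candidate shift. The following is a minimal sketch, assuming the reading in which sign(r) = 1 iff r > 0 and a set A of d-dimensional vectors is full when some shift x yields all 2^d sign vectors of x + y, y in A. The function names and the example point sets are hypothetical and do not come from the cited papers. The sketch only verifies a given witness x; it does not search for one.

```python
from typing import Iterable, Sequence, Set, Tuple


def sign(r: float) -> int:
    """Return 1 iff r > 0, and 0 otherwise (strict inequality is an assumption)."""
    return 1 if r > 0 else 0


def sign_vectors(points: Iterable[Sequence[float]],
                 shift: Sequence[float]) -> Set[Tuple[int, ...]]:
    """Collect the distinct sign vectors <sign(x_i + y_i)> over all y in points."""
    return {tuple(sign(xi + yi) for xi, yi in zip(shift, y)) for y in points}


def is_full_witness(points: Iterable[Sequence[float]],
                    shift: Sequence[float]) -> bool:
    """True if this particular shift produces all 2^d possible sign vectors."""
    d = len(shift)
    return len(sign_vectors(points, shift)) == 2 ** d


# Hypothetical example sets in the plane (d = 2).
corners = [(-1.0, -1.0), (-1.0, 1.0), (1.0, -1.0), (1.0, 1.0)]
diagonal = [(1.0, 1.0), (2.0, 2.0)]

if __name__ == "__main__":
    print(is_full_witness(corners, (0.0, 0.0)))     # the four corners hit every sign pattern
    print(is_full_witness(diagonal, (-1.5, -1.5)))  # two points cannot give four patterns
```

A set with fewer than 2^d points can never be full, which is why the two-point example fails for every shift, not only the one tested.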
David Haussler. Generalizing the PAC model for neural net and other learning applications. Technical Report UCSC-CRL-89-30, University of California Santa Cruz, Sep 1989.
....Institute of Science, Rehovot 76100, Israel. Footnote 1: Permissibility is a measurability condition for uncountable classes of measurable functions. In the present case, F is permissible (see [14], p. 196). Footnote 2: A free product of k functions $f_i(\cdot)$ is of the form $(f_1(x), f_2(x), \ldots, f_k(x))$ [2]. Footnote 3: For the restricted case of orthographic projection and rigid transformation, there exists an algorithm that can learn to recognize an object of this type from just six views, obtained by Gram-Schmidt orthonormalization from a few tens of random views [11]. ....
D. Haussler. Generalizing the PAC model for neural net and other learning applications. UCSC-CRL 89-30, U. of California, Santa Cruz, 1989.
.... Suppose now we are given a certain probability distribution function $H(x, y)$ on the space $D \times [b_1, b_2]$ and $(X_1, Y_1), (X_2, Y_2), \ldots, (X_n, Y_n)$ are chosen independently and randomly according to $H(x, y)$. The following definition of (expected) VC entropy was used by Haussler in [7]. Definition 1 For any $\epsilon > 0$, the VC entropy of $\Im$ for samples of size $n$ is defined to be $N(\Im, \epsilon, n) = E[N(\epsilon, (X_1, Y_1), \ldots, (X_n, Y_n))]$, where the expectation is with respect to the distribution $H(x, y)$. The (expected) VC entropy $N(\Im, \epsilon, n)$ measures, in a certain ....
....proven that the least squares estimates lead to a consistent estimation when $N(\Im, \epsilon, n)$ diverges to infinity at most subexponentially, namely $\log N(\Im, \epsilon, n) / n \to 0$ as $n \to \infty$. The refinement of Vapnik and Chervonenkis's result is the following theorem proven by Pollard [4] and Haussler [7]. Proposition 2 There holds $$\Pr\Big\{ \sup_{f \in \Im} \Big| \int Q(x, y, f)\, dH(x, y) - \frac{1}{n} \sum_{i=1}^{n} Q(X_i, Y_i, f) \Big| > \epsilon \Big\} \le 4 N(\Im, \epsilon, n)\, e^{-\epsilon^2 n / (64 (b_2 - b_1)^4)}.$$ Let $H(x)$ denote the marginal distribution of the input variable $X$, and let $\|\cdot\|_2$ ....
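To get a feel for how the bound in Proposition 2 behaves, its right-hand side can be evaluated numerically. This is a minimal sketch, assuming the bound reads 4 N exp(-eps^2 n / (64 (b2 - b1)^4)), which is a reconstruction from a damaged extract. The VC entropy N is treated as a number supplied by the caller, and the function name and argument names are hypothetical.

```python
import math


def deviation_bound(vc_entropy: float, eps: float, n: int,
                    b1: float, b2: float) -> float:
    """Evaluate 4 * N * exp(-eps^2 * n / (64 * (b2 - b1)^4)).

    vc_entropy stands in for the expected VC entropy N; it is an input here,
    not something this sketch computes.
    """
    if eps <= 0 or n <= 0 or b2 <= b1:
        raise ValueError("require eps > 0, n > 0 and b1 < b2")
    exponent = -(eps ** 2) * n / (64.0 * (b2 - b1) ** 4)
    return 4.0 * vc_entropy * math.exp(exponent)


if __name__ == "__main__":
    # With a fixed (assumed) entropy the bound decays exponentially in n.
    for n in (64, 640, 6400):
        print(n, deviation_bound(vc_entropy=1.0, eps=1.0, n=n, b1=0.0, b2=1.0))
```

The bound is only informative once it drops below 1, which requires the entropy term to grow subexponentially in n, matching the consistency condition stated in the same passage.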
D. Haussler. Generalizing the PAC Model for Neural Net and Other Learning Applications. University of California Santa Cruz Technical Report UCSC-CRL-89-30.
.... of Haussler [6] and Natarajan [8]. We consider functions with finite or countably infinite range, generalising the definition of the VC dimension [15]. We do not deal here with functions which take values in Euclidean space, as we feel these are best analysed using the elegant theory described in [7] of functions taking values in arbitrary metric spaces. In the final section, we apply the results to a problem in artificial neural networks. Haussler [7], Baum and Haussler [2] and Natarajan [8] have obtained upper bounds on a sample size guaranteeing valid generalisation in networks of certain ....
....We do not deal here with functions which take values in Euclidean space, as we feel these are best analysed using the elegant theory described in [7] of functions taking values in arbitrary metric spaces. In the final section, we apply the results to a problem in artificial neural networks. Haussler [7], Baum and Haussler [2] and Natarajan [8] have obtained upper bounds on a sample size guaranteeing valid generalisation in networks of certain types. We obtain a bound on the (generalised) VC dimension of a feedforward linear threshold net with multiple outputs, extending the result of Baum and ....
David Haussler, Generalizing the PAC model for neural net and other learning applications, Technical Report UCSC-CRL-89-30, University of California Computer Research Laboratory, Santa Cruz, CA, 1989.
....a bounded number of such erroneous responses, and Frazier and Pitt [FP94] consider learning when such incorrect responses occur randomly with probability at most 1/2. In other related work, Kearns and Schapire [KS94] generalized the PAC setting to non-binary values using Haussler's framework [Hau89]. They define a p-concept in which each example $x \in X$ has some probability $p(x)$ of being classified as positive. In their model, the goal of the learner is to make optimal predictions, or more commonly, to accurately predict $p(x)$ for all $x \in X$. One way to compare our model to theirs is to ....
D. Haussler. Generalizing the PAC model for neural net and other learning applications. Technical Report UCSC-CRL-89-30, University of California Santa Cruz, September 1989.
....$1 - \delta$, finds a concept that has error less than or equal to $\epsilon$, using a number of examples that is polynomial in $1/\epsilon$ and $\delta$. Later research has extended this model to analyze the effects of noise [Angluin and Laird, 1986; Kearns and Li, 1987] and to allow arbitrary cost functions [Haussler, 1989]. The machine learning community has generally viewed induction as a problem of searching a space of potential hypotheses to find a consistent one. Michalski's description of the Star system [1983] gives a large set of inference rules, which can be thought of as operators for searching the space. ....
David Haussler. Generalizing the PAC model for neural net and other learning applications. Technical Report UCSC-CRL-89-30, UC Santa Cruz, September 1989.
.... proposed and studied by Angluin [Ang88]. Our formulation requires the algorithm to be particularly robust in the sense that we do not assume anything about the target distribution, a formulation which is closely related to the robust generalization of the PAC paradigm proposed by Haussler in [Hau89]. The distance measure between the distributions used in this paper to evaluate the accuracy of a hypothesis with respect to the target distribution is the well-known Kullback-Leibler divergence. Other commonly used measures of distance between probability distributions are, for example, the ....
David Haussler. Generalizing the PAC model for neural net and other learning applications. Technical Report UCSC-CRL-89-30, University of California at Santa Cruz, September 1989. Extended abstract appeared in the Proceedings of FOCS '89.
....algorithms are compared against algorithms learning from randomly chosen examples. In general, the number of randomly chosen examples needed to achieve an expected error of no more than $\epsilon$ scales as $O(\frac{1}{\epsilon} \log \frac{1}{\epsilon})$ [Blumer et al., 1989; Baum and Haussler, 1989; Cohn and Tesauro, 1992; Haussler, 1992]. In some situations, active selection of training examples can reduce the sample complexity to $O(\log \frac{1}{\epsilon})$, although worst-case bounds for unconstrained querying are no better than those for choosing at random [Eisenburg and Rivest, 1990]. Average-case analysis indicates that on many ....
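The gap between the two scaling laws quoted above is easy to see numerically. The sketch below is illustrative only: it evaluates the growth terms (1/eps) log(1/eps) and log(1/eps) with all constant factors dropped, so the outputs are not sample sizes. The function names are hypothetical.

```python
import math


def passive_scale(eps: float) -> float:
    """Growth term (1/eps) * log(1/eps) for randomly chosen examples."""
    return (1.0 / eps) * math.log(1.0 / eps)


def active_scale(eps: float) -> float:
    """Growth term log(1/eps) for the favorable active-selection case."""
    return math.log(1.0 / eps)


if __name__ == "__main__":
    # The ratio of the two terms is exactly 1/eps, so it widens as eps shrinks.
    for eps in (0.1, 0.01, 0.001):
        print(eps, passive_scale(eps), active_scale(eps))
```

As the quoted passage notes, this advantage holds only in some situations; worst-case bounds for unconstrained querying are no better than those for random sampling.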
D. Haussler. (1992) Generalizing the PAC model for neural nets and other learning applications. Information and Computation, 100(1):78--150.
....Random Sampling vs. Active Learning. Most neural network generalization problems are studied only with respect to random sampling: the training examples are chosen at random, and the network is simply a passive learner. This approach is generally referred to as learning from examples. Baum and Haussler (1989) examine the problem analytically for neural networks; Cohn and Tesauro (1992) provide an empirical study of neural network generalization when learning from examples. There have also been a number of empirical efforts, such as (Le Cun et al., 1990), aimed at improving neural network ....
.... do we have to draw and classify from an arbitrary distribution P in order to find a concept $c \in C$ consistent with the examples such that $\epsilon(c, t, P) \le \epsilon$ with confidence at least $1 - \delta$? This problem was formalized by Valiant (1984) and has been studied for neural networks in (Baum and Haussler, 1989) and (Haussler, 1989). Figure 2: The region of uncertainty, $R(S^m)$, is the set of all points x in the domain such that there are two concepts that are consistent with all training examples in $S^m$ and yet disagree on the classification of x. Here, R(S ....
D. Haussler. (1989) Generalizing the PAC model for neural nets and other learning applications. UCSC Tech Report UCSC-CRL-89-30.
....[Valiant, 1984]. One of the primary features of the PAC model is that it permits analysis of hypotheses that only approximate the correct solution. The PAC model has been extended in many fruitful ways [e.g., Amsterdam, 1988; Schapire, 1991] and specific results are available for neural networks [Haussler, 1989], which are the most widely used classification algorithms in secondary structure prediction. 1.5.2 Database mining. Genetic algorithms. In this thesis a genetic algorithm [Holland, 1975; Goldberg, 1989] is used to search the space of possible amino acid representations. As discussed in Chapter 3, ....
.... [Sun et al., 1991; Jones, 1992], but training a simple threshold neural network is NP-complete [Blum and Rivest, 1992]. Fast algorithms are known for finding good neural network topologies [Roy and Mukhopadhyay, 1992], but the number of instances needed to train them is typically large [Baum and Haussler, 1989]. Clustering algorithm. Clustering algorithms group objects in such a way that intragroup similarities are maximized while intergroup similarities are minimized. These groups partition the set of all objects so that previously unseen objects may be placed into a group. Thus, the result of running ....
Haussler, D. (1989). Generalizing the PAC model for neural net and other learning applications. Technical report, University of California, Santa Cruz, CA.
....used this more general framework. By using the quadratic loss function mentioned above in place of the discrete loss, Kearns and Schapire investigate the problem of efficiently learning a real-valued regression function that gives the probability of a classification for each instance [26]. In [17] it is shown how the VC dimension and related tools, originally developed by Vapnik, Chervonenkis, and others for this type of analysis, can be applied to the study of learning in neural networks. Here no restrictions whatsoever are placed on the joint probability distribution governing the ....
D. Haussler. Generalizing the PAC model for neural net and other learning applications. Information and Computation, 1990. To appear.
No context found.
Haussler, D. (1989). Generalizing the PAC model for neural net and other learning applications. Technical Report UCSC-CRL-89-30, University of California Santa Cruz.
No context found.
David Haussler. Generalizing the PAC model for neural net and other learning applications. Technical Report UCSC-CRL-89-30, University of California at Santa Cruz, September 1989. Extended abstract appeared in the Proceedings of FOCS '89.