10 citations found.
M. Anthony, P. Bartlett, Y. Ishai, and J. Shawe-Taylor. Valid generalisation from approximate interpolation. Submitted.

This paper is cited in the following contexts:
Using Vapnik-Chervonenkis Dimension to Analyze the Testing.. - Romanik, Vitter   (2 citations)

....on the number of times that the loop is executed (for example, when the input x is used as the upper bound on the loop index variable) then the construct is not randomly approximately testable. Our work also bears some similarity to the work by Anthony, Bartlett, Ishai and Shawe-Taylor [ABIST94, ABIST95] on valid generalization from approximate interpolation. They say that a set of functions H (a hypothesis space) validly generalizes a set of functions C (a concept space) from approximate interpolation if for any proximity parameter, confidence .... [Footnote 1: PAC stands for probably almost correct.]

....from X into R if and only if the pseudo-dimension of H is finite. Random approximate testability is similar to valid generalization from approximate interpolation where the proximity parameter used is zero. The difference is that in the work of Anthony, Bartlett, Ishai and Shawe-Taylor [ABIST94, ABIST95] the error of a hypothesis is the probability that a randomly drawn input is not within the proximity parameter, rather than being the expected difference in value between the hypothesis and the target for a randomly drawn input. [2 Measuring Testing Complexity] When we examine the testing ....

Martin Anthony, Peter Bartlett, Yuval Ishai, and John Shawe-Taylor. Valid Generalisation from Approximate Interpolation. Combinatorics, Probability and Computing, 1995. to appear.


Using Vapnik-Chervonenkis Dimension to Analyze the Testing.. - Romanik, Vitter   (2 citations)

....bound on the number of times that the loop is executed (for example, when the input x is used as the upper bound on the loop index variable) then the construct is not randomly approximately testable. Our work also bears some similarity to the work by Anthony, Bartlett, Ishai and Shawe-Taylor [ABIST94, ABIST95] on valid generalization from approximate interpolation. They say that a set of functions H (a hypothesis space) validly generalizes a set of functions C (a concept space) from approximate interpolation if for any proximity parameter, .... [Footnote 1: PAC stands for probably almost correct.]

....functions from X into R if and only if the pseudo-dimension of H is finite. Random approximate testability is similar to valid generalization from approximate interpolation where the proximity parameter used is zero. The difference is that in the work of Anthony, Bartlett, Ishai and Shawe-Taylor [ABIST94, ABIST95] the error of a hypothesis is the probability that a randomly drawn input is not within the proximity parameter, rather than being the expected difference in value between the hypothesis and the target for a randomly drawn input. [2 Measuring Testing Complexity] When we examine the ....

Martin Anthony, Peter Bartlett, Yuval Ishai, and John Shawe-Taylor. Valid Generalisation from Approximate Interpolation. Technical Report NC-TR-94011, Royal Holloway University of London, December 1994.


Function Learning from Interpolation (Extended Abstract) - Anthony, Bartlett (1995)

....Another reason for studying the problem of this paper is that it has implications for learning in the presence of malicious noise, in which the labels on the training sample can be any real numbers within η of the true value of the target. A related problem, that in which γ = 0, was studied in [2], and matching necessary and sufficient conditions were given. Partial results were presented in [4]. These conditions are demonstrably stricter than those for the present problem. [2 Definitions and the Main Result] A number of ways of measuring the expressive power of a class H of functions ....

Anthony, M. , Bartlett, P.L. , Ishai, Y. , Shawe-Taylor, J. (1994). Valid generalisation from approximate interpolation. To appear, Combinatorics, Probability and Computing.


Probabilistic Analysis of Learning in Artificial Neural Networks: .. - Anthony (1994)   (10 citations)  Self-citation (Anthony Shawe-Taylor)

....Theorem 14.3 that finite γ-dimension for all γ is necessary for function learning with a good model. Indeed, it is not. However, any upper bound on the sample complexity of p-concept learning a class H provides an upper bound for learning H with a good model. In [8, 19, 17], the following result is given. This ensures that if the hypothesis space has finite pseudo-dimension and a learning algorithm interpolates well enough on the training sample then it is a probably approximately correct algorithm. Further, it shows that if there is some noise in the classification ....

.... m_L(η, δ, ε) of order (1/ε)(pdim(H) ln(1/ε) + ln(1/δ)), such that if m ≥ m_L, the following holds: for any probability distribution μ on X and any t ∈ C, with probability at least 1 − δ, μ({x ∈ X : |h_L(x) − t(x)| ≥ η}) < ε. The result presented in [8] is more general than this. In that paper, it is shown that to obtain accurate bounds on the sample complexity function m_L (and, indeed, bounds depending on η) it is more appropriate to use a scale-sensitive dimension termed the band dimension, also used in [81]. It is easy to show that the ....

M. Anthony, P. Bartlett, Y. Ishai, and J. Shawe-Taylor. Valid generalisation from approximate interpolation. Submitted.


Probabilistic `Generalization' of Functions and Dimension-based.. - Anthony (1999)   (1 citation)  Self-citation (Anthony)

....least 1 , a P^m-random sample z ∈ Z^m is such that er_P(L(z)) inf_{h∈H} er_P(h) 4. 2 Generalization from interpolation Another approach to the generalization of real functions is to consider generalization from approximate interpolation, where we can develop two distinct models [3, 2]. This approach is less general than the loss functions approach, in that it extends Definition 2.2 rather than Definition 2.4. For these models of generalization, we do have a target real function t : X → ℝ together with a probability measure on X, and the aim is to find a good approximation to t ....

....1) That is, we wish to be sure that, with probability at least 1 , if h is an interpolant of t on the sample, in the sense that t(x_i) h(x_i) t(x_i) for i = 1, 2, ..., m, then h(x) is within of t(x) on a set of measure at least 1 . Then we arrive at the following definition [6, 3]. Definition 4.3 Let H be a set of functions from X to [0, 1]. We say that H generalizes from approximate interpolation if for all ; ∈ (0, 1) there is m_0( such that, for all probability measures on X and all t : X → ℝ, if m m_0( then with m probability at least 1 , ....

[Article contains additional citation context not shown here]

Anthony, M., P. Bartlett, Y. Ishai, and J. Shawe-Taylor (1994). Valid generalisation from approximate interpolation. To appear, Combinatorics, Probability and Computing.


Interpolation and Learning in Artificial Neural Networks - Anthony (1996)   Self-citation (Anthony Shawe-Taylor)

....input in the sequence and does not merely perform well on average. In this paper I discuss what conclusions may be drawn about how well such an h interpolates further on general inputs. I shall present relevant results of Anthony and Bartlett [2] and Anthony, Bartlett, Ishai and Shawe-Taylor [3] and describe how these results apply in a neural networks context. [2 A General Formulation] We think of there being a target function, t, which the neural network is being trained to learn by means of some learning algorithm. The information provided during training consists of a sequence of ....

....∈ Ω and |h(x_i) − t(x_i)| < η for 1 ≤ i ≤ m, then P({x ∈ X : |h(x) − t(x)| ≥ η + γ}) < ε. We say that N approximates from interpolated examples if there is such an m_0(η, γ, ε, δ) for all η, γ, ε, δ > 0. The following stronger condition has also been studied [3]. Here, we require that a state which is an η-interpolant of a randomly drawn sample be, with high probability, η-close in value to the target function on almost all of the inputs. Definition 2 Let η, ε, δ > 0. We say that the integer m_0(η, ε, δ) is a sufficient sample length for the ....

[Article contains additional citation context not shown here]

M. Anthony, P. Bartlett, Y. Ishai, J. Shawe-Taylor, (1994). Valid generalisation from approximate interpolation. To appear, Combinatorics, Probability and Computing.


Function Learning from Interpolation - Anthony, Bartlett (1994)   (1 citation)  Self-citation (Anthony Bartlett Shawe-Taylor)

....examples. Section 5 discusses the gap between these bounds, and describes the implications for learning with malicious noise. Section 6 describes the relationship between the problem of approximation from interpolated examples and a related problem (the case γ = 0 in Definition 1) studied in [3]. By way of motivation, we give upper and lower bounds on the number of examples necessary for approximation from interpolated examples with a particular function class, a class of Lipschitz continuous functions. Proposition 2 Suppose that k > 0. Let H be the class of all k-Lipschitz continuous ....

....noise (in this sense) then it can certainly learn in the presence of uniformly distributed random noise (as defined in [5]), which implies fat_H is finite ([5], Theorem 3). That is, a function class H is learnable with malicious noise if and only if fat_H is finite. [6 When γ meets 0] In [3], the related problem of valid generalization from approximate interpolation was studied. We say that H validly generalizes C from approximate interpolation if, for all η > 0 and ε, δ ∈ (0, 1) there is an m such that for all probability distributions P on X and all t ∈ C, with P^m ....

[Article contains additional citation context not shown here]

Anthony, M. , Bartlett, P.L. , Ishai, Y. , Shawe-Taylor, J. (1994). Valid generalisation from approximate interpolation. To appear, Combinatorics, Probability and Computing.


Fat-Shattering and the Learnability of Real-Valued Functions - Bartlett, Long, al. (1995)   (24 citations)  Self-citation (Bartlett)

....function class to that of learning {0, 1}-valued functions. In addition to the aforementioned papers, other general results about learning real-valued functions have been obtained. Haussler [15] gives sufficient conditions for agnostic learnability. Anthony, Bartlett, Ishai, and Shawe-Taylor [4] give necessary and sufficient conditions that a function that approximately interpolates the target function is a good approximation to it (see also [5] and [3]). Natarajan [20] considers the problem of learning a class of real-valued functions in the presence of bounded observation noise, and ....

....approximately interpolates the target function is a good approximation to it (see also [5] and [3]). Natarajan [20] considers the problem of learning a class of real-valued functions in the presence of bounded observation noise, and presents sufficient conditions for learnability. Theorem 2 in [4] shows that these conditions are not necessary in our setting. Merhav and Feder [18] and Auer, Long, Maass, and Woeginger [6] study function learning in a worst-case setting. In the next section, we define admissible noise distribution classes and the learning problems, and present the ....

M. Anthony, P. L. Bartlett, Y. Ishai and J. Shawe-Taylor, Valid generalisation from approximate interpolation, Combinatorics, Probability and Computing, 1994, (to appear).


Function Learning from Interpolation - Anthony, Bartlett (1995)   (1 citation)  Self-citation (Anthony Bartlett)

....examples. Section 5 discusses the gap between these bounds, and describes the implications for learning with malicious noise. Section 6 describes the relationship between the problem of approximation from interpolated examples and a related problem (the case γ = 0 in Definition 1) studied in [3]. By way of motivation, we give upper and lower bounds on the number of examples necessary for approximation from interpolated examples with a particular function class, a class of Lipschitz continuous functions. Proposition 2 Suppose that k > 0. Let H be the class of all k-Lipschitz continuous ....

....noise (in this sense) then it can certainly learn in the presence of uniformly distributed random noise (as defined in [5]), which implies fat_H is finite ([5], Theorem 3). That is, a function class H is learnable with malicious noise if and only if fat_H is finite. [6 When γ meets 0] In [3], the related problem of valid generalization from approximate interpolation was studied. We say that H validly generalizes C from approximate interpolation if, for all η > 0 and ε, δ ∈ (0, 1) there is an m such that for all probability distributions P on X and all t ∈ C, with P^m ....

[Article contains additional citation context not shown here]

Anthony, M. , Bartlett, P.L. , Ishai, Y. , Shawe-Taylor, J. (1994). Valid generalisation from approximate interpolation. To appear, Combinatorics, Probability and Computing.


Probabilistic Analysis of Learning in Artificial Neural Networks: .. - Anthony (1997)   (10 citations)  Self-citation (Anthony)

....that we cannot infer from Theorem 14.3 that finite γ-dimension for all γ is necessary for function learning with a good model. Indeed, it is not. However, any upper bound on the sample complexity of p-concept learning a class H provides an upper bound for learning H with a good model. In [9, 20, 18], the following result is given. This shows that if the hypothesis space has finite pseudo-dimension and a learning algorithm interpolates well enough on the training sample then it is a probably approximately correct algorithm. Further, it shows that if there is some noise in the classification of ....

.... m_L(η, δ, ε) of order (1/ε)(pdim(H) ln(1/ε) + ln(1/δ)), such that if m ≥ m_L, the following holds: for any probability distribution μ on X and any t ∈ C, with probability at least 1 − δ, μ({x ∈ X : |h_L(x) − t(x)| ≥ η}) < ε. The result presented in [9] is more general than this. In that paper, it is shown that to obtain accurate bounds on the sample complexity function m_L (and, indeed, bounds depending on η) it is more appropriate to use a scale-sensitive dimension termed the band dimension, also used in [91]. In [9] it is also shown that ....

[Article contains additional citation context not shown here]

M. Anthony, P. Bartlett, Y. Ishai, and J. Shawe-Taylor. Valid generalisation from approximate interpolation. Combinatorics, Probability and Computing, Volume 5 191--214 (1996).
