P. Niyogi and F. Girosi. Generalization bounds for function approximation from scattered noisy data. Advances in Computational Mathematics, 10:51--80, 1999.
....error decreases but the sample error increases. This means that there is an optimal complexity of the hypothesis space for a given number of training data. In the case of the regularization algorithm described in this paper, this tradeoff corresponds to an optimum value for the regularization parameter, as studied by [11, 35, 3]. In empirical work, the optimum value is often found through cross-validation techniques [53]. This tradeoff between approximation error and sample error is probably the most critical issue in determining good performance on a given problem. The class of regularization algorithms, such as ....
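The cross-validation step mentioned in the excerpt above can be sketched as follows. This is a minimal illustration, not the cited authors' procedure: the polynomial feature map, the synthetic data, the candidate grid, and the names `ridge_fit`, `cv_error`, and `lam` are all assumptions made for the example. Small values of `lam` give a rich hypothesis space (low approximation error, high sample error), large values the reverse, and k-fold validation error is used to pick a value in between.

```python
import math
import random

def solve(A, b):
    """Solve the linear system A w = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for j in range(c, n + 1):
                M[r][j] -= f * M[c][j]
    w = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][j] * w[j] for j in range(i + 1, n))
        w[i] = (M[i][n] - s) / M[i][i]
    return w

def features(x, degree=5):
    """Hypothetical feature map: monomials 1, x, ..., x^degree."""
    return [x ** k for k in range(degree + 1)]

def ridge_fit(xs, ys, lam):
    """Minimize sum_i (w . phi(x_i) - y_i)^2 + lam * ||w||^2 via the normal equations."""
    Phi = [features(x) for x in xs]
    d = len(Phi[0])
    A = [[sum(r[i] * r[j] for r in Phi) + (lam if i == j else 0.0)
          for j in range(d)] for i in range(d)]
    b = [sum(r[i] * y for r, y in zip(Phi, ys)) for i in range(d)]
    return solve(A, b)

def predict(w, x):
    return sum(wi * fi for wi, fi in zip(w, features(x)))

def cv_error(xs, ys, lam, k=5):
    """Mean squared validation error averaged over k interleaved folds."""
    n = len(xs)
    total = 0.0
    for fold in range(k):
        val = set(range(fold, n, k))
        tr = [i for i in range(n) if i not in val]
        w = ridge_fit([xs[i] for i in tr], [ys[i] for i in tr], lam)
        total += sum((predict(w, xs[i]) - ys[i]) ** 2 for i in val) / len(val)
    return total / k

# Synthetic noisy data (an assumption for illustration only).
random.seed(0)
xs = [random.uniform(-1, 1) for _ in range(40)]
ys = [math.sin(3 * x) + random.gauss(0, 0.2) for x in xs]

grid = [10.0 ** p for p in range(-6, 3)]          # candidate regularization values
errors = [cv_error(xs, ys, lam) for lam in grid]  # validation error per candidate
best_lam = grid[errors.index(min(errors))]
```

The selected `best_lam` depends on the data, the noise level, and the grid; the sketch only shows the selection mechanism.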
P. Niyogi and F. Girosi. Generalization bounds for function approximation from scattered noisy data. Advances in Computational Mathematics, 10:51--80, 1999.
....a large variety of results about the approximation by neural networks (cf. [3, 7, 13, 21, 26, 29] and the references quoted there) the stability aspect has been largely neglected so far; only a few authors give a rigorous treatment of regularization methods for the approximation problem (cf. [14, 15, 16, 33, 40]). In Section 2 we will show that network approximation in Sobolev spaces is equivalent to least squares collocation for a corresponding integral equation of the first kind. Based on the well known results about least squares collocation we will derive results about the convergence in the case of ....
....of gradients, which is necessary for the numerical solution of the minimization problem. We will give an analysis for arbitrary choice of the parameter $t$; obviously all approximation results hold for the optimal choice of $t$, too (which is the case so far mainly studied in the literature, cf. [3, 7, 13, 26, 33]). The stability results deduced in Section 4 cannot be applied to the case of optimized $t$ in a simple way, since the dependence of $t$ upon the data $f^\delta$ has to be examined additionally. An extension of the stability results to this nonlinear problem will be one of our main future projects. So ....
[Article contains additional citation context not shown here]
P.Niyogi, F.Girosi, Generalization bounds for function approximation from scattered noisy data, Adv. Comp. Math. 10 (1999), 51-80.
....Fonds zur Förderung der wissenschaftlichen Forschung under grant SFB F013 1308 1 where $P$ is a compact subset of $\mathbb{R}^p$ and $\phi$ is a given activation function. The above network architecture is frequently used for approximation problems because of its good approximation properties (cf. e.g. [5, 13, 15, 19]), especially in the case of ridge constructions (cf. e.g. [1, 7, 14]), where $\phi$ is of the form $\phi(x, a, b) = \sigma(a^T x + b)$, $a \in A \subset \mathbb{R}^d$, $b \in B \subset \mathbb{R}$. (1.3) Hornik et al. [13] showed that the union of the sets $X_n$ defined in (1.2) with $\phi$ given by (1.3) is dense in $C(\Omega)$ ($\Omega$ ....
.... ($\Omega \subset \mathbb{R}^d$) if $\sigma$ is a continuous function of sigmoidal form, i.e. $\sigma$ is monotone and $\lim_{s \to -\infty} \sigma(s) = 0$, $\lim_{s \to \infty} \sigma(s) = 1$. In subsequent papers, the approximation capabilities of several network constructions with linear output layers have been investigated (cf. e.g. [1, 15, 19] and the references therein). A result of particular interest is the dimension-independent convergence rate $\inf_{f_n \in X_n} \|f - f_n\|_{L^2(\Omega)} = O(n^{-1/2})$ (1.4), which can be achieved under the additional conditions (cf. [19]) $\sup_{t \in P} \|\phi(\cdot, t)\|_{L^2(\Omega)} < \infty$ and $f$ ....
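A ridge-type unit with a sigmoidal activation, as described in the excerpt above, can be sketched as follows. The definition (1.2) of the sets $X_n$ is not shown in the excerpt, so the finite-sum form with a linear output layer used here is an assumption, as are the logistic choice of $\sigma$, the sign convention inside the activation, and the names `ridge_unit` and `network`.

```python
import math

def sigmoid(s):
    """Logistic function: continuous, monotone, tends to 0 at -inf and to 1 at +inf."""
    if s >= 0:
        return 1.0 / (1.0 + math.exp(-s))
    e = math.exp(s)  # avoids overflow for very negative s
    return e / (1.0 + e)

def ridge_unit(x, a, b):
    """One ridge unit sigma(a^T x + b) for an input vector x."""
    return sigmoid(sum(ai * xi for ai, xi in zip(a, x)) + b)

def network(x, coeffs, params):
    """Assumed network form: sum_j c_j * sigma(a_j^T x + b_j) with a linear output layer."""
    return sum(c * ridge_unit(x, a, b) for c, (a, b) in zip(coeffs, params))

# Two hypothetical units on R^2.
value = network([0.0, 0.0], [2.0, -1.0], [([1.0, 1.0], 0.0), ([1.0, -1.0], 0.0)])
```

Here `n` corresponds to the number of units, i.e. the length of `coeffs`.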
[Article contains additional citation context not shown here]
P. Niyogi and F. Girosi, Generalization bounds for function approximation from scattered noisy data, Adv. Comp. Math. 10 (1999), 51--80.
....of the range of the integral operator $h \mapsto \int_P h(t)\,\phi(\cdot, t)\,dt$. Rates are usually only obtained under additional conditions on $f$ (cf. e.g. [4]). A natural condition seems to be that $f$ is in the range of the above operator, i.e. $f(x) = \int_P h(t)\,\phi(x, t)\,dt$. (1.5) It was shown in [6] that under this condition the rate $\inf_{g \in X_n} \|f - g\|_{L^2} = O(n^{-1/2})$ (1.6) is obtained if $\phi$ is a continuous function. We improve this result under additional smoothness assumptions on the basis function $\phi$ in the next section with estimates also in $H^m(\Omega)$ ....
....will give error bounds in $W^{m,r}$ that depend on the dimension $p$ of $P$, where the analysis is based on finite element theory. In Section 3, we apply the results to perceptrons and give sufficient conditions on $f$ for condition (1.5) to hold. 2. Error Bounds An inspection of the proof of (1.6) in [6] shows that the result can be improved if the activation function $\phi$ is Hölder continuous. Moreover, rates can be obtained in $H^m(\Omega)$. Theorem 2.1. Let $X_n$ be defined as in (1.4) with $P \subset \mathbb{R}^p$ bounded and $\phi$ such that $\|\phi(\cdot, t) - \phi(\cdot, s)\|_{H^m(\Omega)} \le c\,\|t -$ ....
P. Niyogi and F. Girosi, Generalization bounds for function approximation from scattered noisy data, Adv. Comp. Math. 10 (1999), 51--80.
....to imply that A is a good algorithm. However, it is easier to determine empirically whether an algorithm has good training error. Also, in general, generalization error bounds are more elusive than training error bounds. Remark 2.8 There are other ways of analyzing true error. Some authors [Niy98, NG99, CS02] write true error as the sum of approximation error, which is the error rate $Err_D(h)$ of the optimal classifier $h \in H$, and estimation error, which is $Err_D(f_S) - Err_D(h)$. Approximation error measures how well $H$ can fit the data, and estimation error measures the gap between the ....
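The decomposition in Remark 2.8 can be checked numerically on a toy problem. The sketch below is an illustration under stated assumptions: the finite domain, the labeling rule, the class of threshold classifiers, and the use of empirical risk minimization to produce $f_S$ are all hypothetical choices, not taken from the cited work.

```python
import random

# Hypothetical setup: x uniform on {0, ..., 9}; the label is 1 iff 3 <= x <= 6.
POINTS = list(range(10))

def label(x):
    return 1 if 3 <= x <= 6 else 0

# Hypothesis class H: threshold classifiers h_t(x) = 1 iff x >= t, for t = 0..10.
H = list(range(11))

def predict(t, x):
    return 1 if x >= t else 0

def true_error(t):
    """Err_D(h_t), computed exactly under the uniform distribution on POINTS."""
    return sum(predict(t, x) != label(x) for x in POINTS) / len(POINTS)

def train_error(t, sample):
    """Fraction of the sample misclassified by h_t."""
    return sum(predict(t, x) != y for x, y in sample) / len(sample)

def decompose(sample):
    """Return (approximation error, estimation error, true error of f_S)."""
    f_S = min(H, key=lambda t: train_error(t, sample))  # empirical risk minimizer
    approx = min(true_error(t) for t in H)              # error of the best h in H
    total = true_error(f_S)
    return approx, total - approx, total

random.seed(0)
sample = [(x, label(x)) for x in random.choices(POINTS, k=8)]
approx, estimation, total = decompose(sample)
```

No threshold classifier can represent the interval-shaped labeling, so the approximation error stays positive however large the sample is, while the estimation error depends on the particular sample drawn.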
P. Niyogi and F. Girosi. Generalization bounds for function approximation from scattered noisy data. Advances in Computational Mathematics, 10(1):51-80, 1999.