Sollich, P. (1994). Query construction, entropy and generalization in neural network models. Physical Review E, 49, 4637--4651.
....input signals. The problem of designing input signals for optimal generalization is called active learning (Cohn, Ghahramani, & Jordan, 1996; Fukumizu, 1996; Vijayakumar & Ogawa, 1999). It is also referred to as optimal experiments (Kiefer, 1959; Fedorov, 1972; Cohn, 1994) or query construction (Sollich, 1994). Reinforcement learning (Kaelbling, 1996), which has been extensively studied recently in the field of machine learning, can be regarded as another form of active learning. In mathematical statistics, an active learning criterion called the D-optimal design has been thoroughly studied (Kiefer, ....
Sollich, P. (1994). Query construction, entropy and generalization in neural network models. Physical Review E, 49, 4637--4651.
....if we can actively design input signals. The problem of designing input signals for optimal generalization is called active learning (Cohn et al. [5], Fukumizu [8], Vijayakumar and Ogawa [23]). It is also referred to as optimal experiments (Fedorov [7], Cohn [3]) or query construction (Sollich [14]). Reinforcement learning (Kaelbling [10]), which has been extensively studied recently in the field of machine learning, can be regarded as another form of active learning. Active learning can be classified into two categories depending on the optimality. One is the global optimal, where a set ....
Sollich, P. (1994). Query construction, entropy and generalization in neural network models. Physical Review E, 49, 4637--4651.
....in model parameter space. Second, we address the important sample complexity question, i.e., does the active strategy require fewer examples to learn the target to the same degree of uncertainty? Our results are stated in PAC style (Valiant, 1984). After completion of this work, we learnt that Sollich (1994) had also recently developed a similar formulation to ours. His analysis is conducted in a statistical physics framework. The rest of the paper is organized as follows: Section 2 develops our active sampling paradigm. In Sections 3 and 4, we consider two classes of functions for which active ....
....the y values actually observed but only on the x values sampled. Thus if the learner is to collect n data points, it can precompute the n points at which to sample from the start. In this sense the active algorithm is not really adaptive. This behavior has also been observed by MacKay (1992) and Sollich (1994). 2) Needless to say, the general framework from optimal design can be used for any function class within a Bayesian framework. We are currently investigating the possibility of developing active strategies for Radial Basis Function networks. While it is possible to compute exact expressions for ....
P. Sollich. (1994) Query Construction, Entropy, Generalization in Neural Network Models.
....and applications using optimal experiment design. A canonical description of the theory of OED is given in Fedorov [1972]. MacKay [1992] showed that OED could be incorporated into a Bayesian framework for neural network data selection and described several interesting optimization criteria. Sollich [1994] considers the theoretical generalization performance of linear networks given greedy vs. globally optimal queries and varying assumptions on teacher distributions. Empirically, optimal experiment design techniques have been successful when used for system identification tasks. In these cases a ....
....remaining part of the trajectory. Experiments using this form of optimization did not demonstrate measurable improvement, in the average case, over the greedy method, so it appears that trajectory optimization may not be worth the additional computational expense, except in extreme situations (see Sollich [1994] for a theoretical comparison of greedy and globally optimal querying). 4 Experimental Results. In this section, we describe two sets of experiments using optimal experiment design for error minimization. The first attempts to confirm that the gains predicted by optimal experiment design may ....
P. Sollich. (1994) Query construction, entropy and generalization in neural network models. To appear in Physical Review E.
.... We shall call the resulting queries minimum (student or teacher space) entropy (MSSE/MTSE) queries; their effect on generalization performance has recently been investigated for perfectly learnable problems, where student and teacher space are identical (Seung et al., 1992; Freund et al., 1993; Sollich, 1994), and was found to depend qualitatively on the structure of the teacher. For a linear perceptron, for example, one obtains a relative reduction in generalization error compared to learning from random examples which becomes insignificant as the number of training examples, p, tends to infinity. ....
....for weight decay $\lambda = 0.01, 0.1, 1$. $S_N = -\frac{1}{2N} \ln \det M_N$, where we have omitted an unimportant constant which depends on the training temperature only. This entropy is minimized by choosing each new query along the direction corresponding to the minimal eigenvalue of the existing $M_N$ (Sollich, 1994). The expression for the resulting average generalization error is given by eq. (2) with $G$ replaced by its analogue for MSSE queries (Sollich, 1994), $G_Q = \frac{\Delta\alpha}{\lambda + [\alpha] + 1} + \frac{1 - \Delta\alpha}{\lambda + [\alpha]}$, where $[\alpha]$ is the greatest integer less than or equal to $\alpha$ and $\Delta\alpha = \alpha -$ ....
[Article contains additional citation context not shown here]
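The citation context above states that the student-space entropy, proportional to minus the log-determinant of M_N, is minimized by querying along the eigenvector of M_N with the smallest eigenvalue. The following is a minimal numerical sketch of that statement, not code from any of the cited papers. It assumes, purely for illustration, that M_N is the weight decay times the identity plus a sum of outer products of unit-norm inputs; the helper names (`entropy`, `min_eigen_query`, `add_query`) and the values of `N` and `lam` are hypothetical.

```python
import numpy as np


def entropy(M):
    """Entropy up to an additive constant: -(1 / 2N) * ln det M."""
    N = M.shape[0]
    _sign, logdet = np.linalg.slogdet(M)
    return -logdet / (2 * N)


def min_eigen_query(M):
    """Unit vector along the eigenvector of M with the smallest eigenvalue."""
    _eigvals, eigvecs = np.linalg.eigh(M)  # eigenvalues returned in ascending order
    return eigvecs[:, 0]


def add_query(M, x):
    """Assumed update rule: each queried input x adds the outer product x x^T."""
    return M + np.outer(x, x)


rng = np.random.default_rng(0)
N, lam = 5, 0.1          # hypothetical input dimension and weight decay
M = lam * np.eye(N)      # assumed starting matrix: weight decay times identity

# Seed M with a few random unit-norm examples.
for _ in range(3):
    x = rng.normal(size=N)
    M = add_query(M, x / np.linalg.norm(x))

# Compare one further random example with one minimum-eigenvalue query.
x_random = rng.normal(size=N)
x_random /= np.linalg.norm(x_random)
x_query = min_eigen_query(M)

S_random = entropy(add_query(M, x_random))
S_query = entropy(add_query(M, x_query))
print(f"entropy after random example: {S_random:.4f}")
print(f"entropy after min-eigenvalue query: {S_query:.4f}")
```

Under these assumptions the result follows from the matrix determinant lemma: det(M + x x^T) = det(M) (1 + x^T M^{-1} x), and for unit-norm x the quadratic form is largest along the smallest-eigenvalue direction of M, so the query never yields a higher entropy than a random input of the same norm.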
P Sollich (1994). Query construction, entropy, and generalization in neural network models. Phys. Rev. E, 49:4637--4651.