| Y Freund, H S Seung, E Shamir, and N Tishby. Information, prediction, and query by committee. In S J Hanson, J D Cowan, and C Lee Giles, editors, Advances in Neural Information Processing Systems 5, pages 483--490, San Mateo, CA, 1993. Morgan Kaufmann. |
....to determine the degree of uncertainty and the second one to do the classification. In this work, a probabilistic classifier was chosen for the first task based on efficiency considerations and C4.5 rule induction was chosen for the second task. Another related approach is called Query by Committee [27, 16]. In one version of the query by committee approach two classifiers consistent with the already labeled training data are randomly chosen. Instances of the data for which the two chosen classifiers disagree are then candidates for labeling. The emphasis here has been to prove theoretical results ....
Y. Freund, H. Seung, E. Shamir, and N. Tishby. Information, prediction, and query by committee. In Advances in Neural Information Processing Systems 5, pages 337--344. Morgan Kaufmann, 1992.
....limited to a single learning algorithm. Kong and Dietterich [13] point out that combining heterogeneous learning algorithms can reduce bias as well as variance if the bias errors of the various algorithms are different. Krogh and Vedelsby [14] have developed a method known as query by committee [30, 6, 6]. In this approach, as a collection of neural networks is trained simultaneously, patterns which have large ambiguity (i.e. the ensemble's predictions tend to vary considerably) are more likely to be included in the next round of training. 7.3. Non-constant Weighting Functions Some combining ....
Yoav Freund, H. Sebastian Seung, Eli Shamir, and Naftali Tishby. Information, prediction, and query by committee. In Stephen José Hanson, Jack D. Cowan, and C. Lee Giles, editors, Advances in Neural Information Processing Systems, volume 5, pages 483--490. Morgan Kaufmann, San Mateo, CA, 1993.
....about the target function it is supposed to learn. We study an approach to active learning that is based on principles of information theory and which selects a query that yields the maximum expected information gain. In particular, we consider a query algorithm called Query by Committee (QBC) [2, 1] which implements an approximation of this optimal query strategy in a convenient way. QBC employs an ensemble or committee of learners to estimate the information gain of a query. A query is selected in such a way that the committee's estimation of the expected information gain that can be ....
Freund, Y., Seung, H. S., Shamir, E., and Tishby, N.: Information, prediction, and query by committee. In Neural Information Processing Systems 5, ed. by S. J. Hanson, J. D. Cowan, and C. L. Giles (Morgan Kaufmann, San Mateo, CA 1993) pp. 483--490
....Approach As already mentioned, data gathering is expensive, but computation in today's world is fast and cheap. Based upon this philosophical premise we suggest a means of achieving both the above goals of active learning simultaneously [9, 8, 10]. We have used the query by committee approach [3, 11]. If several models are fitted to random subsamples of a small amount of initially collected data, they will probably disagree in the first instance. If we add minimal additional data according to some defined criterion to our sample and repeat the process of having several models examine random ....
Y. Freund, H.S. Seung, E. Shamir, and N. Tishby. Information, prediction, and query by committee. In Advances in Neural Information Processing Systems, volume 5. Morgan Kaufmann, San Mateo, California, 1993.
....accepted at high-level vision tasks such as appearance learning and object matching [12, 13]. 2.3 Active Learning The main paradigm used for active learning is uncertainty-based sampling. The idea is to sample randomly at regions of high uncertainty [4] or to choose the most uncertain point [10, 7, 9]. These approaches are based on the intuition that sampling in regions of high uncertainty will yield a better classifier. Uncertainty may be provided by the classification algorithm itself [10] or be defined as a value of disagreement between the committee members [7, 9]. Another approach is a ....
....most uncertain point [10, 7, 9]. These approaches are based on the intuition that sampling in regions of high uncertainty will yield a better classifier. Uncertainty may be provided by the classification algorithm itself [10] or be defined as a value of disagreement between the committee members [7, 9]. Another approach is a lookahead scheme [5] where the effect of labeling each example on the classifier performance is estimated and the example that maximizes the expected accuracy of the resulting classifier is chosen. 3 Framework Let S be the algorithm for selective sampling and F be the set ....
Yoav Freund, H. Sebastian Seung, Eli Shamir, and Naftali Tishby. Information, prediction, and query by committee. In S. Hanson et al., editor, Advances in Neural Information Processing, volume 5, pages 483--490. Morgan Kaufmann, 1993.
....high risk for heart disease In applications where instances are images or natural language texts, arbitrary membership queries are also implausible. Several algorithms have been proposed that base querying on filtering a stream of unlabeled instances rather than on creating artificial instances [6, 10, 20, 31]. The expert is asked to label only those instances whose class membership is sufficiently uncertain. Several definitions of uncertainty and sufficiency have been used, but all are based on esti 1. Obtain an initial classifier 2. While expert is willing to label instances (a) Apply the current ....
Y. Freund, H. S. Seung, E. Shamir, and N. Tishby. Information, prediction, and query by committee. In Advances in Neural Information Processing Systems 5, San Mateo, CA, 1992. Morgan Kaufmann.
....or the teacher. We shall call the resulting queries minimum (student or teacher space) entropy (MSSE/MTSE) queries; their effect on generalization performance has recently been investigated for perfectly learnable problems, where student and teacher space are identical (Seung et al. 1992, Freund et al. 1993, Sollich 1994) and was found to depend qualitatively on the structure of the teacher. For a linear perceptron, for example, one obtains a relative reduction in generalization error compared to learning from random examples which becomes insignificant as the number of training examples, p, tends ....
.... Q p =1 Theta i 1 p N y w T V x j : It can easily be verified that this entropy is minimized by choosing queries x which bisect the existing version space, i.e. for which the hyperplane perpendicular to x splits the version space into two equal halves (Seung et al. 1992, Freund et al. 1993). Such queries lead to an exponentially shrinking version space, $V(p) \propto 2^{-p}$, and hence a linear decrease of the entropy, $S_V = -\alpha \ln 2$. We consider instead queries which achieve qualitatively the same effect, but permit a much simpler analysis of the resulting student performance. ....
Y Freund, H S Seung, E Shamir, and N Tishby (1993). Information, prediction, and query by committee. In S J Hanson, J D Cowan, and C Lee Giles, editors, NIPS 5, pages 483--490, San Mateo, CA, Morgan Kaufmann.
....with the labeled training data [12]. An infinite stream of unlabeled data is assumed, from which QBC asks the teacher for class labels only on those examples for which the two chosen classifiers disagree. Freund, Seung, Shamir, and Tishby extend the QBC result to a wide range of classifier forms [13]. They prove that, under certain assumptions, the number of queries made after examining m random examples will be logarithmic in m, while generalization error will decrease almost as quickly as it would if queries were made on all examples. More precisely, generalization error decreases as ....
Y. Freund, H. S. Seung, E. Shamir, and N. Tishby. Information, prediction, and query by committee. In Advances in Neural Information Processing Systems 5, San Mateo, CA, 1992. Morgan Kaufmann.
....The use of ensembles is not a new idea. It has been proposed for better generalization [6, 11] and has been shown to converge to the Bayes estimate in perceptron learning [10]. Our use of ensembles of learners is a modification of the Query By Committee (QBC) algorithm, proposed by Seung et al. [13, 5] for approximating the version space in a query filtering problem. The algorithm maintains .... [Figure 1: The concept 252a. The domain of the concept is [-1, 1]^2. Grey areas are labeled +, white areas are labeled -. Samples for training and testing were drawn uniformly from the domain.]
Y. Freund, H.S. Seung, E. Shamir, and N. Tishby. Information, prediction, and query by committee. In Hanson, Cowan, and Giles, editors, Advances in Neural Information Processing Systems, volume 5. Morgan Kaufmann, 1993.
....maximal information about the function. Methods where the learner points out good examples are often called active learning. We propose a query-based active learning scheme that applies to ensembles of networks with continuous-valued output. It is essentially a generalization of query by committee [6, 7] that was developed for classification problems. Our basic assumption is that those patterns in the input space yielding the largest error are those points we would benefit the most from including in the training set. Since the generalization error is always non-negative, we see from (6) that the ....
Y. Freund, H.S. Seung, E. Shamir, and N. Tishby. Information, prediction, and query by committee. In Advances in Neural Information Processing Systems, volume 5, San Mateo, California, 1993. Morgan Kaufmann.
....defined by MacKay [12]. Entropy based investigations within the query filtering scenario have also been examined by Sollich [24, 23]. A different approach was used by Paass and Kindermann [15] who designed their queries to minimise a Bayesian risk function. Seung and Freund's query by committee [21, 7] is a query filter that uses Cohn's general notion of selective sampling from a region of uncertainty [2]. It leads to building effective querying algorithms that have been applied to different problem scenarios by Krogh and Vedelsby [11] by ....
.... the Statistical JackKnife The original formulation of the query by committee algorithm of Seung and Freund was simply: if a committee of hypotheses is in disagreement over the label of a point, then it becomes necessary to query the environment for its label and to add it to the training set [7]. The level of disagreement between a committee of hypotheses is therefore the criterion for querying. In Krogh and Vedelsby's paper [11] and our earlier work [16, 17, 18] the statistical variance function of the different hypotheses propounded by a committee of feedforward neural networks, is ....
Y. Freund, H.S. Seung, E. Shamir, and N. Tishby. Information, prediction, and query by committee. In Advances in Neural Information Processing Systems, volume 5. Morgan Kaufmann, San Mateo, California, 1993.
.... and can be easily implemented with existing search engines, we suspect that other strategies for collecting examples may be competitive or superior; for instance, promising results have been obtained with uncertainty sampling (Lewis and Gale 1994) and query by committee (Seung et al. 1992; Freund et al. 1992; Dagan and Engelson 1995). A few final details require some discussion. Constraining the initial query: To construct the first query, a large set of documents was used as default negative examples. A default negative example is treated as an ordinary negative example unless it has already been ....
Y. Freund, H.S. Seung, E. Shamir, and N. Tishby. Information, prediction, and query by committee. In Advances in Neural Information Processing Systems 5, pages 483--490, San Mateo, CA, 1992. Morgan Kaufmann.
....network. Their algorithm makes queries that allow the network to efficiently determine the connection weights from the input layer to the hidden layer. Seung et al. (1992) independently proposed a similar scheme for selecting queries, basing it on a lack of consensus in a committee of learners. Freund et al. (1993) showed that as the size of the committee increases beyond the two learners used in selective sampling, the accuracy of one's utility estimate increases sharply. Work by David MacKay (1992) pursues a related approach to data selection using Bayesian analysis. By assigning prior probabilities to ....
Y. Freund, H. S. Seung, E. Shamir and N. Tishby. (1993) Information, prediction, and query by committee.
.... to be gaining ground and one of the areas where its application has achieved much success is the control of static and mobile robots [52, 71]. There are currently limitations in the generalisation abilities of neural networks and work is being carried out to develop more efficient learning methods [9, 23, 24, 34, 36, 45, 46, 49, 50, 61, 64, 67]. Some researchers are looking at ensembles of neural networks in order to achieve better generalisation [26, 73]. A fair prediction regarding intelligent controllers of the next century would be that they will be able to autonomously improve their performance on-line [10, 69] and to plan while ....
Y. Freund, H.S. Seung, E. Shamir, and N. Tishby. Information, prediction, and query by committee. In Advances in Neural Information Processing Systems, volume 5. Morgan Kaufmann, San Mateo, California, 1993.
....and measurement of data. This is expensive and therefore needs to be reduced as much as possible. Our proposed algorithm [8] suggests a means of achieving both the above objectives simultaneously. 2. An Algorithm to Minimise Data Collection We have used the query by committee approach [3, 9]. If several models are fitted to random subsamples of a small amount of initially collected data, they will probably disagree in the first instance. If we add minimal additional data according to some defined criterion to our sample and repeat the process of having several models examine random ....
Y. Freund, H.S. Seung, E. Shamir, and N. Tishby. Information, prediction, and query by committee. In Advances in Neural Information Processing Systems, volume 5. Morgan Kaufmann, San Mateo, California, 1993.
....at each stage of training. This way, it is possible to avoid the redundancy of annotating many examples that contribute roughly the same information to the learner. The machine learning literature suggests several different approaches for selective sampling (Seung, Opper, & Sompolinsky 1992; Freund et al. 1993; Cohn, Atlas, & Ladner 1994; Lewis & Catlett 1994; Lewis & Gale 1994). In the first part of the paper, we analyze the different issues that need to be addressed when constructing a selective sampling algorithm. These include measuring the utility of an example for the learner, the number of ....
....at any given step. The alternative to measuring the informativeness of an example explicitly is to measure it implicitly, by quantifying the amount of uncertainty in the classification of the example, given the current training data. The committee-based paradigm (Seung, Opper, & Sompolinsky 1992; Freund et al. 1993) does this, for example, by measuring the disagreement among committee members on a classification. The main advantage of the implicit approach is its generality, as there is no need for complicated model-specific derivations of expected information gain. The informativeness, or utility, of a ....
Freund, Y.; Seung, H. S.; Shamir, E.; and Tishby, N. 1993. Information, prediction, and query by committee.
No context found.
Y Freund, H S Seung, E Shamir, and N Tishby. Information, prediction, and query by committee. In S J Hanson, J D Cowan, and C Lee Giles, editors, Advances in Neural Information Processing Systems 5, pages 483--490, San Mateo, CA, 1993. Morgan Kaufmann.
No context found.
Freund, Y., Seung, H. S., Shamir, E., and Tishby, N.: Information, prediction, and query by committee. In Neural Information Processing Systems 5, ed. by S. J. Hanson, J. D. Cowan, and C. L. Giles (Morgan Kaufmann, San Mateo, CA 1993) pp. 483--490
No context found.
Y. Freund, H. S. Seung, E. Shamir, and N. Tishby, Information, prediction, and query by committee, in Advances in Neural Information Processing Systems 5, S. J. Hanson, J. D. Cowan, and C. Lee Giles eds., Morgan Kaufmann, San Mateo, CA, 1993, pp.483--490.
No context found.
Y. Freund, H.S. Seung, E. Shamir, and N. Tishby. Information, prediction, and query by committee. In Advances in Neural Information Processing Systems, volume 5. Morgan Kaufmann, San Mateo, California, 1993.
CiteSeer.IST - Copyright Penn State and NEC