D. Cohn, L. Atlas, and R. Ladner. Training connectionist networks with queries and selective sampling. In Advances in Neural Information Processing Systems 2. MIT Press, 1990.
....received partial support from NSF grant CCR 9821087. Partially supported by DFG (JA 379 9 1, MU 987 1 1) and travel grants from EU (Neurocolt II). as positive (active) or negative (inactive). In Machine Learning this type of problem has been called query learning [Ang88], selective sampling [CAL90], or active learning [TK00]. A Round0 data set contains 1,316 chemically diverse examples, only 39 of which are positive. A second Round1 data set has 634 examples with 150 positives. This data set is preselected on the basis of medicinal chemistry intuition. Note that our classification ....
D. Cohn, L. Atlas, and R. Ladner. Training connectionist networks with queries and selective sampling. Advances in Neural Information Processing Systems, 2, 1990.
....methods used to assist this process should be adaptable to these varying requirements. In this paper we discuss several selection strategies that are aimed at addressing some of these issues. Additionally, we use a rather new paradigm from Machine Learning theory called active learning [2, 3, 4, 5]. Unlike more conventional learning methods where the data (training set) used to derive the model remains static, we let our data set grow with each round. In each round the algorithm actively selects a batch of unlabeled compounds to be tested for activity. Once the results from this batch ....
L. Atlas, D. Cohn, R. Ladner, M. A. El-Sharkawi, R. J. Marks, M. E. Aggoune, and D. C. Park. Training connectionist networks with queries and selective sampling. In D. Touretzky, editor, Advances in Neural Information Processing Systems 2, pages 566--573. Morgan Kaufmann, 1990.
....point of view all examples (compounds) are initially unlabeled. In each iteration the learner selects a batch of unlabeled examples to be labeled as positive (active) or negative (inactive). In Machine Learning this type of problem has been called query learning [Ang88], selective sampling [CAL90], or active learning [TK00]. A Round0 data set contains 1,316 chemically diverse examples, only 39 of which are positive. A second Round1 data set has 634 examples with 150 positives. This data set is preselected on the basis of medicinal chemistry intuition. Note that our classification ....
D. Cohn, L. Atlas, and R. Ladner. Training connectionist networks with queries and selective sampling. Advances in Neural Information Processing Systems, 2:566--573, 1990.
....properties known in machine learning such as ε-net and support vector machines. Working out this connection is left for future work. Partially supported by project I403 001.06 95 of the German Israeli Foundation for Scientific Research [GIF]. 1 Introduction. In the Active Learning paradigm [3] the learner is given access to a stream of unlabeled samples, drawn at random from a fixed and unknown distribution, and for every sample the learner decides whether to query the teacher for the label. Complexity in this context is measured by the number of requests directed to the teacher along ....
D. Cohn, L. Atlas, and R. Ladner. Training connectionist networks with queries and selective sampling. Advances in Neural Information Processing Systems 2, 1990.
....data generation mechanism can be viewed as interactive learning. On the other hand, the use of only available (or randomly generated) data is passive learning. If properly done, the use of queries can reduce the cost of data drastically compared with the case where examples are generated at random [1]. 2 Inversion of Multilayer Perceptrons. Multilayer perceptrons are feedforward neural networks, which have one or more layers of hidden neurons between the input and output layers. The system dynamics in the retrieving phase of an L-layer neural net can be described by the following equations: ....
D. Cohn, L. E. Atlas, R. Ladner, R. J. Marks II, M. El-Sharkawi, M. Aggoune, and D. C. Park. Training connectionist networks with queries and selective sampling. In Advances in Neural Information Processing Systems, Denver, November 1989.
....Using Query by Committee, Linear Separation and Random Walks. Shai Fine, Ran Gilad-Bachrach, Eli Shamir. Institute of Computer Science, The Hebrew University, Jerusalem, Israel. {fshai,ranb,shamir}@cs.huji.ac.il. Abstract. In the Active Learning paradigm [CAL90] the learner tries to minimize the number of labeled instances it uses in the process of learning. The reasoning comes from many real-life problems where the teacher's activity is an expensive resource (e.g. text categorization, part-of-speech tagging). The Query By Committee (QBC) [SOS92] is ....
....of concepts by means of querying the target labels in a random sample of instances, has generated many studies and experimental works in recent years. In this work we re-examine the Query By Committee (QBC) algorithm, formulated and analyzed in [SOS92, FSST97]. QBC is an active learning algorithm [CAL90] which incorporates a relevance test for a potential label. Having access to a stream of unlabeled instances, the relevance test filters out the instances to which it assigns a low value, trying to minimize the number of labels used while learning. The motivation comes from many real-life problems ....
D. Cohn, L. Atlas, and R. Ladner. Training connectionist networks with queries and selective sampling. Advances in Neural Information Processing Systems 2, 1990.
....for high-dimensional data and it is not clear how to effectively apply this approach on data typically seen in many real-life applications. Active learning is a term coined to represent methods where the learning algorithm assumes some control over the subset of the input space used in the modeling [9, 10]. In this paper, active learning will mean learning from unlabeled data, where an oracle can be queried for labels of specific instances, with the goal of minimizing the number of oracle queries required. Active learning has been proposed in various forms [2, 10, 11, 12, 17, 23, 24, 27]. We will ....
D. Cohn, L. Atlas, and R. Ladner. Training connectionist networks with queries and selective sampling. In Advances in Neural Information Processing Systems 2. Morgan Kaufmann, 1990.
....ask the user to go through all the documents and identify those relevant to his needs. Instead, we would like the retrieval system to select a small informative subset which will be classified by the user and deduce the rest by itself. This process was termed Selective Sampling by Cohn et al. [32]. The study of the simultaneous use of both kinds of observations in learning leads to questions of the relative value of labeled and unlabeled instances. This was addressed by Castelli and Cover [29, 30], who took an information-theoretic standpoint, namely the Bayesian learning setting. Assume ....
....class of concepts by means of querying the target labels in a random sample of instances, has generated many studies and experimental works in recent years. In this work we re-examine the Query By Committee (QBC) algorithm, formulated and analyzed in [104, 55]. QBC is an active learning algorithm [32] which incorporates a relevance test for a potential label. Having access to a stream of unlabeled instances, the relevance test filters out the instances to which it assigns a low value, trying to minimize the number of labels used while learning. The motivation comes from many real-life problems, ....
D. Cohn, L. Atlas, and R. Ladner. Training connectionist networks with queries and selective sampling. Advances in Neural Information Processing Systems 2, 1990.
....a network the same example fifty times is a waste of time: it relearns the same thing every time; by contrast, choosing fifty significant examples from the example base is more efficient. Informative active learning therefore speeds up convergence [Allred Kelly 89] [Atlas et al. 90] [Strand Jones 92] [Plutowski White 93] [Cachin 94]. Comparing active learning with passive learning yields learning curves of the kind shown in Figure III.4. (Figure axes: error versus time; curves: active, passive.) Figure III.4. Learning curves in the passive case ....
....are presented in [Cachin 94]. In the setting of direct action on the environment, one generally speaks of learning by query: the environment (sometimes called an oracle in this context) is given an input pattern (a stimulus) and must supply the response. In [Atlas et al. 90], [RayChaudhuri Hamey 95], and [Hwang et al. 91], such a strategy is used to build a training set of reduced size. In [Biehl Schwarze 92], at each weight update the new example to learn is determined by query. Various criteria used for ....
[Article contains additional citation context not shown here]
Les Atlas, David Cohn, Richard Ladner, M. A. El-Sharkawi, R. J. Marks II, M. E. Aggoune & D. C. Park. Training Connectionist Networks with Queries and Selective Sampling. In D. S. Touretzky (Ed.): Advances in Neural Information Processing Systems 2, p. 566-573, San Mateo: Morgan-Kaufmann, 1990.
....back propagation neural networks. Neural networks have been used in many applied settings, including such diverse tasks as signal processing [Reilly et al. 1992], catalog merchandising [Schwartz, 1992], medical screening [Rutenberg, 1992, Weber, 1990], and municipal power grid security [Atlas et al. 1990d]. Back propagation neural networks are an interesting method for classifier induction because they promise highly parallel (and hence very fast) real-time classification. They also provide a novel representational formalism, which leads to improved performance on some tasks, as demonstrated by ....
....because they promise highly parallel (and hence very fast) real-time classification. They also provide a novel representational formalism, which leads to improved performance on some tasks, as demonstrated by several recent empirical studies [Fisher and McKusick, 1989, Mooney et al. 1989, Atlas et al. 1990c, Atlas et al. 1990d, Cole et al. 1990, Atlas et al. 1990a, Dietterich et al. 1990, Fahlman and Lebiere, 1990b, Pratt and Norton, 1990, Weiss and Kulikowski, 1991, Shavlik et al. 1991, Thrun et al. 1991]. However, these same studies indicate that neural network learning is usually much ....
[Article contains additional citation context not shown here]
Les Atlas, David Cohn, and Richard Ladner. Training connectionist networks with queries and selective sampling. In D. S. Touretzky, editor, Advances in Neural Information Processing Systems 2, pages 566--573. Morgan Kaufmann, San Mateo, CA, 1990.
....character images that had no natural meaning. The learning algorithm that is analyzed in this paper uses random unlabeled instances as queries and in this way avoids the problem encountered by Baum's algorithm. Our work is derived within the query filtering paradigm. In this paradigm, proposed by [Cohn et al. 1990], the learner is given access to a stream of inputs drawn at random from the input distribution. The learner sees every input, but chooses whether or not to query the teacher for the label. Giving the learner easy access to unlabeled random examples is a very reasonable assumption in many ....
....whose labels are most informative. Initially, most examples will be informative for the learner, but as the process continues, the prediction capabilities of the learner improve, and it discards most of the examples as non-informative, thus saving the human teacher a large amount of work. In [Cohn et al. 1990] there are several suggestions for query filters together with some empirical tests of their performance on simple problems. Seung et al. [Seung et al. 1992] have suggested a filter called query by committee, and analytically calculated its performance for some perceptron-type learning problems. ....
D. Cohn, L. Atlas, and R. Ladner. Training connectionist networks with queries and selective sampling. Advances in Neural Information Processing Systems, 2:566--573, 1990.
....IH magnitudes tend to lead to large positive or large negative I j for hidden units j. As described in Section 3.1.2 above, this tends to make hidden units have near-binary (close to 0 or 1) activations. Many researchers (cf. [Widrow and Winter, 1988], [Hanson and Burr, 1987], [Fahlman, 1990], [Cohn et al. 1990]) have used static versions of the graphical display technique described in this section. Unlike several other techniques, such as Hinton Diagrams [Hinton et al. 1984] and some sophisticated graphical interface packages (e.g. [Wejchert and Tesauro, 1990]), this method does not only illustrate ....
....for DBT. The DBT algorithm allows information about the target data to be merged with weights from the source classifier. Such a procedure is applicable in contexts other than transfer, such as in active learning schemes (where new training data is obtained by querying the environment, cf. [Cohn et al. 1990]) and in pedagogical training methods (where training data is segregated into stages in order to speed learning). DBT may also be used as a postprocessor to a method such as KBANN, to adapt a knowledge source to the training data prior to learning. In these contexts, DBT acts as a catalyst ....
D. Cohn, L. E. Atlas, R. Ladner, M. A. El-Sharkawi, R. J. Marks II, M. E. Aggoune, and D. C. Park. Training Connectionist Networks with Queries and Selective Sampling. In D. S. Touretzky, editor, Advances in Neural Information Processing Systems 2, pages 566--573. Morgan Kaufmann, San Mateo, CA, 1990.
....is analyzed in this paper uses random unlabeled instances as queries and in this way may avoid the problem encountered by Baum's algorithm. In the work above, queries are explicitly constructed. In contrast, our work is derived within the query filtering paradigm. In this paradigm, proposed by [CAL90], the learner is given access to a stream of inputs drawn at random from the input distribution. The learner sees every input, but chooses whether or not to query the teacher for the label. Giving the learner easy access to unlabeled random examples is a very reasonable assumption in many ....
....whose labels are most informative. Initially, most examples will be informative for the learner, but as the process continues, the prediction capabilities of the learner improve, and it discards most of the examples as non-informative, thus saving the human teacher a large amount of work. In [CAL90] there are several suggestions for query filters together with some empirical tests of their performance on simple problems. Seung et al. [SOS92] have suggested a filter called query by committee, and analytically calculated its performance for some perceptron-type learning problems. For these ....
David Cohn, Les Atlas, and Richard Ladner. Training connectionist networks with queries and selective sampling. In D. Touretzky, editor, Advances in Neural Information Processing Systems 2, San Mateo, CA, 1990. Morgan Kaufmann.
....properties than the random distribution. For other applications, however, it might be difficult to find a meaningful distance measure. A more adept approach to selecting a suitable training set would be to adapt the training set by dynamically selecting training patterns while training proceeds. Atlas, Cohn and Ladner [1990] have proposed an algorithm which, by investigation of the network state, decides which patterns are to be added to the training set. Although their algorithm shows better results than random selection of training sets, it requires expensive computations and is therefore difficult to use in ....
L. Atlas, D. Cohn, and R. Ladner. Training connectionist networks with queries and selective sampling. In D. Touretzky, editor, Advances in Neural Information Processing Systems 2 (NIPS 89), pages 566--573. Morgan Kaufmann Pub., 1990.
....j is scaled by a per-pattern factor f, f ∈ [0, 1], determined by fuzzy inference. When using large training sets, the result of the inference can be used to decide which patterns need to be learnt in the current epoch, thereby turning the parameter adaptation into a pattern selection strategy [13, 2]. In this case, the factor f_p would be interpreted as the probability for selecting pattern p for learning in the next training epoch. Table 1 can be directly transformed into the fuzzy rulebase. Two fuzzy variables, degree of difficulty (D) and error (E), will serve as antecedents. Their ....
L. Atlas, D. Cohn, and R. Ladner. Training connectionist networks with queries and selective sampling. In D. S. Touretzky, editor, NIPS, pages 566--573. Morgan Kaufmann, 1989.
....by a per-pattern factor f, f ∈ [0, 1], determined by fuzzy inference. When using large training sets, the result of the inference can be used to decide which patterns need to be learnt in the current epoch, thereby turning the parameter adaptation into a pattern selection strategy [Reine, 1995; Atlas et al. 1989]. In this case, the factor f_p would be interpreted as the probability for selecting pattern p for learning in the next training epoch. Table 1 can be directly transformed into the fuzzy rulebase. Two fuzzy variables, degree of difficulty (D) and error (E), will serve as antecedents. The ....
L. Atlas, D. Cohn, and R. Ladner. Training connectionist networks with queries and selective sampling. In D. S. Touretzky, editor, NIPS, pages 566--573. Morgan Kaufmann, 1989.
....this model is that collecting and labelling data is costly. The information captured by a corpus as a function of its size flattens out quickly. For particular problems, one can show that the expected information gain of a pattern's label diminishes exponentially [13]. Work in query-based methods [3, 2, 13] has dealt with the issue of minimizing labeling cost, but it ignores the cost of waiting for the informative patterns. The on site model is suggested on the reasonable assumption that patterns that are low on information with respect to the classifier are, in fact, patterns that can be classified ....
....1,2. 6, 9]. The QBC network classified the patterns that the committee agreed upon and queried when there was disagreement. Each time an additional 20 new patterns had been queried, the committee was retrained on the complete set of queried patterns. This is similar to the method described in [3], yet differs in that we use the current networks as the new initial points. In this work we did not concern ourselves with the computational requirements of the methods. Comm100 and Comm1000 were trained for 500,000K BP iterations and cross-validated with 10,000 independent patterns to choose the ....
D. Cohn, L. Atlas, and R. Ladner. Training connectionist networks with queries and selective sampling. In Touretzky, editor, Advances in Neural Information Processing Systems, volume 2. Morgan Kaufmann, 1990.
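The committee procedure in the excerpt above (classify when the committee agrees, query on disagreement, retrain after every 20 queried patterns) can be illustrated with a toy sketch. Everything here is an assumption made for illustration: the names `fit_committee`, `committee_votes`, and `run_qbc_stream` are hypothetical, and the neural-network committee is replaced by two threshold classifiers on one-dimensional inputs. It is not the citing authors' implementation.

```python
import random


def fit_committee(labeled):
    """Return (lo, hi): the two extreme thresholds consistent with the labels.

    Assumes the target is a 1-D threshold concept (label 1 iff x >= theta).
    """
    lo = max((x for x, y in labeled if y == 0), default=float("-inf"))
    hi = min((x for x, y in labeled if y == 1), default=float("inf"))
    return lo, hi


def committee_votes(committee, x):
    """Votes of the two members: one uses the lowest consistent threshold, one the highest."""
    lo, hi = committee
    return int(x > lo), int(x >= hi)


def run_qbc_stream(stream, oracle, batch_size=20):
    """Filter a stream of inputs, querying the oracle only when the members disagree."""
    labeled = []                      # every (x, label) pair obtained by a query
    pending = 0                       # queries made since the last retraining
    committee = fit_committee(labeled)
    predictions = []
    for x in stream:
        vote_a, vote_b = committee_votes(committee, x)
        if vote_a == vote_b:
            predictions.append((x, vote_a))   # agreement: classify without a query
        else:
            y = oracle(x)                     # disagreement: ask the teacher
            labeled.append((x, y))
            predictions.append((x, y))
            pending += 1
            if pending == batch_size:         # retrain on all queried patterns
                committee = fit_committee(labeled)
                pending = 0
    return predictions, labeled


if __name__ == "__main__":
    rng = random.Random(0)
    stream = [rng.random() for _ in range(2000)]
    preds, labeled = run_qbc_stream(stream, lambda x: int(x >= 0.37))
    print(len(labeled), "queries for", len(stream), "inputs")
```

Because both members stay consistent with every label seen so far, they can only agree on inputs whose label is already determined, so in this toy setting the unqueried predictions are never wrong and the query rate drops after each retraining.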
....Figure 4: Output activation contour and corresponding hidden units for a Gaussian distribution problem. Many researchers (cf. [Widrow and Winter, 1988], [Hanson and Burr, 1990], [Fahlman and Lebiere, 1990a], [Atlas et al. 1990]) have used static versions of the graphical display technique described in this section. Unlike several other methods, such as Hinton Diagrams [Hinton et al. 1984] and some sophisticated network display packages (e.g. [Wejchert and Tesauro, 1990]), this method not only illustrates the ....
....for DBT. The DBT algorithm allows information about the target data to be merged with weights from the source classifier. Such a procedure is applicable in contexts other than transfer, such as in active learning schemes (where new training data is obtained by querying the environment, cf. [Atlas et al. 1990]) and in pedagogical training methods (where training data is segregated into stages in order to speed learning). DBT may also be used as a postprocessor to a method such as KBANN, to adapt a knowledge source to the training data prior to learning. In these contexts, DBT acts as a catalyst ....
Les Atlas, David Cohn, and Richard Ladner. Training connectionist networks with queries and selective sampling. In D. S. Touretzky, editor, Advances in Neural Information Processing Systems 2, pages 566--573. Morgan Kaufmann, San Mateo, CA, 1990.
....with is how to choose which x to try next. There are many heuristics for choosing x, including choosing places where we don't have data (Whitehead, 1991), where we perform poorly (Linden & Weber, 1993), where we have low confidence (Thrun & Möller, 1992), where we expect it to change our model (Cohn, Atlas, & Ladner, 1990, 1994), and where we previously found data that resulted in learning (Schmidhuber & Storck, 1993). In this paper we will consider how one may select x in a statistically "optimal" manner for some classes of machine learning algorithms. We first briefly review how the statistical approach can be ....
Cohn, D., Atlas, L., & Ladner, R. (1990). Training connectionist networks with queries and selective sampling. In Touretzky, D. (Ed.), Advances in Neural Information Processing Systems 2. Morgan Kaufmann.
....a world model, the learner is trying to build a mapping, e.g. from joint angles to cartesian coordinates (or from state-action pairs to next states). If it is allowed to select arbitrary joint angles (inputs) in successive time steps, then the problem is one of selecting the next query to make ([Cohn, 1990], [Baum and Lang, 1991]). In exploration, however, one's choices for a next input are constrained by the current input. We cannot instantaneously teleport to remote parts of the state space, but must choose among inputs that are available in the next time step. One approach to selecting a next ....
D. Cohn, L. Atlas and R. Ladner. (1990) Training connectionist networks with queries and selective sampling. In D. Touretzky, editor, Advances in Neural Information Processing Systems 2, Morgan Kaufmann, San Francisco.
No context found.
D. Cohn, L. Atlas, and R. Ladner. Training connectionist networks with queries and selective sampling. In Advances in Neural Information Processing Systems 2. MIT Press, 1990.
No context found.
D. Cohn, L. Atlas, and R. Ladner. Training connectionist networks with queries and selective sampling. In D. Touretzky, editor, Advances in Neural Information Processing Systems 2, San Mateo, CA, 1990. Morgan Kaufmann.
No context found.
Les Atlas, David Cohn, and Richard Ladner. Training connectionist networks with queries and selective sampling. In D. S. Touretzky, editor, Advances in Neural Information Processing Systems 2, pages 566--573. Morgan Kaufmann, San Mateo, CA, 1990.
No context found.
David Cohn, Les Atlas, and Richard Ladner. Training connectionist networks with queries and selective sampling. In D. Touretzky, editor, Advances in Neural Information Processing Systems 2, San Mateo, CA, 1990. Morgan Kaufmann.
No context found.
D. Cohn, L. Atlas, and R. Ladner. Training connectionist networks with queries and selective sampling. In D. Touretzky, editor, Advances in Neural Information Processing Systems 2. Morgan Kaufmann, San Mateo, California, 1990.