David Cohn, Les Atlas, and Richard Ladner. Improving generalization with self-directed learning, 1992. To appear in Machine Learning.
....high risk for heart disease. In applications where instances are images or natural language texts, arbitrary membership queries are also implausible. Several algorithms have been proposed that base querying on filtering a stream of unlabeled instances rather than on creating artificial instances [6, 10, 20, 31]. The expert is asked to label only those instances whose class membership is sufficiently uncertain. Several definitions of uncertainty and sufficiency have been used, but all are based on esti 1. Obtain an initial classifier 2. While expert is willing to label instances (a) Apply the current ....
....to the threshold can be found; where the stream of instances is effectively infinite, one can choose instances whose scores are within some distance of the threshold. The cycle is described in Figure 1 for the finite case. Single classifier approaches to uncertainty sampling have been criticized [6, 20] on the grounds that one classifier is not representative of the set of all classifiers consistent with the labeled data: the version space [24]. The degree to which this is a problem in practice has not been established. Single classifier approaches have successfully been used in generating ....
David Cohn, Les Atlas, and Richard Ladner. Improving generalization with self-directed learning, 1992. To appear in Machine Learning.
....label the subsample of b examples (d) Train a new classifier on all labeled examples Figure 1. An algorithm for uncertainty sampling with a single classifier. Recently, several algorithms for learning via queries have been proposed that filter existing examples rather than creating artificial ones [9, 10, 11]. These algorithms ask a teacher to label only those examples whose class membership is sufficiently uncertain. Several definitions of uncertainty have been used, but all are based on estimating how likely a classifier trained on previously labeled data would be to produce the correct class ....
....classification decisions, but estimate their certainty, the certainty estimate can be used to select examples. A single classifier approach to uncertainty sampling has several theoretical failings, including underestimation of true uncertainty, and biases caused by nonrepresentative classifiers [9, 10]. On the other hand, experiments using a single classifier to make arbitrary queries [14] or select subsets of labeled data [8, 15] have shown substantial speedups in learning. Relevance sampling, which has proven quite effective for text retrieval, also uses a single classifier. 3 An Uncertainty ....
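The excerpts above describe uncertainty sampling with a single classifier only in fragments, so the following Python sketch is a hedged illustration rather than the cited authors' algorithm. It assumes a finite pool, a batch size b, a decision threshold of 0.5, and a toy one-dimensional classifier in which larger values indicate class 1. The names train, select_most_uncertain, uncertainty_sampling, and oracle are hypothetical and do not come from the excerpted papers.

```python
import math


def train(labeled):
    """Toy 1-D classifier (an assumption for illustration only).

    Returns a function estimating P(class = 1 | x) from the midpoint of the
    two class means. Requires at least one example of each class.
    """
    pos = [x for x, y in labeled if y == 1]
    neg = [x for x, y in labeled if y == 0]
    midpoint = (sum(pos) / len(pos) + sum(neg) / len(neg)) / 2.0
    return lambda x: 1.0 / (1.0 + math.exp(-(x - midpoint)))


def select_most_uncertain(scores, b, threshold=0.5):
    """Return indices of the b scores closest to the decision threshold."""
    ranked = sorted(range(len(scores)), key=lambda i: abs(scores[i] - threshold))
    return ranked[:b]


def uncertainty_sampling(labeled, pool, oracle, b, rounds):
    """Pool-based loop: score, pick the least certain, label, retrain."""
    labeled = list(labeled)
    pool = list(pool)
    classifier = train(labeled)  # obtain an initial classifier
    for _ in range(rounds):  # stands in for "while the expert is willing"
        if not pool:
            break
        scores = [classifier(x) for x in pool]  # apply the current classifier
        chosen = select_most_uncertain(scores, b)
        # Pop from the highest index down so earlier indices stay valid.
        for i in sorted(chosen, reverse=True):
            x = pool.pop(i)
            labeled.append((x, oracle(x)))  # the expert labels the subsample
        classifier = train(labeled)  # retrain on all labeled examples
    return classifier, labeled
```

For an effectively infinite stream, the excerpt suggests labeling instances whose scores fall within some distance of the threshold; under the same assumptions, that would replace the sort in select_most_uncertain with a simple distance test.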
D. Cohn, L. Atlas, and R. Ladner. Improving generalization with self-directed learning, 1992. To appear in Machine Learning.