Abstract: The ability to cheaply train text classifiers is critical to their use in information retrieval, content analysis, natural language processing, and other tasks involving data which is partly or fully textual. An algorithm for sequential sampling during machine learning of statistical classifiers was developed and tested on a newswire text categorization task. This method, which we call uncertainty sampling, reduced by as much as 500-fold the amount of training data that would have to be...
Cited by:
Active Learning with Multiple Views - Muslea (2002)
Unsupervised Activity Recognition Using Automatically.. - Wyatt, Philipose.. (2005)
Learning to Extract Entities from Labeled and Unlabeled Text - Jones (2005)
Active bibliography (related documents):
1.2: Heterogeneous Uncertainty Sampling for Supervised Learning - Lewis, Catlett (1994)
0.4: Probabilistic Information Retrieval as Combination of.. - Fuhr, Pfeifer (1994)
0.3: An Integrated System for Filtering News and.. - Amati, D'Aloisi..
Similar documents based on text:
0.2: The Projective Geometry of the Gale Transform - Eisenbud, Popescu (1991)
0.2: Fax: An Alternative to SGML - Kenneth Church William
0.1: Finding Terminology Translations From Non-Parallel Corpora - Fung, McKeown (1997)
Related documents from co-citation:
36: Text categorization with Support Vector Machines: Learning with many relevant fe.. - Joachims - 1998
26: Relevance Feedback in Information Retrieval - Rocchio - 1971
25: A Probabilistic Analysis of the Rocchio Algorithm with TFIDF for Text Categoriza.. - Joachims - 1997
BibTeX entry:
Lewis, D. D. (1995). A sequential algorithm for training text classifiers: Corrigendum and additional data. SIGIR Forum, 29 (2), 13--19. http://citeseer.ist.psu.edu/lewis94sequential.html
@inproceedings{ lewis94sequential,
author = "David D. Lewis and William A. Gale",
title = "A sequential algorithm for training text classifiers",
booktitle = "Proceedings of {SIGIR}-94, 17th {ACM} International Conference on Research and Development in Information Retrieval",
publisher = "Springer Verlag, Heidelberg, DE",
address = "Dublin, IE",
editor = "W. Bruce Croft and Cornelis J. van Rijsbergen",
pages = "3--12",
year = "1994",
url = "citeseer.ist.psu.edu/lewis94sequential.html" }
Citations (may not include all citations):
520: Generalized Linear Models - McCullagh, Nelder - 1989
441: Queries and concept learning - Angluin - 1988
274: Generalization as search - Mitchell - 1982
271: Improving retrieval performance by relevance feedback - Salton, Buckley - 1990
153: Sampling Techniques - Cochran - 1977
94: A method for disambiguating word senses in a large corpus - Gale, Church et al. - 1993
83: Query by committee - Seung, Opper et al. - 1992
75: The evidence framework applied to classification networks - MacKay - 1992
73: An evaluation of phrasal and clustered representations on a .. - Lewis - 1992
46: Selecting concise training sets from clean data - Plutowski, White - 1993
39: The condensed nearest neighbor rule (IEEE Transactions on Information Theory) - Hart - 1968
37: Models for retrieval with probabilistic indexing - Fuhr - 1989
32: Query-based learning applied to partially trained multilayer.. - Hwang, Choi et al. - 1991
23: Automatic indexing: An experimental inquiry - Maron - 1961
20: Information, prediction, and query by committee - Freund, Seung et al. - 1992
17: Some inconsistencies and misnomers in probabilistic informat.. - Cooper - 1991
16: Probabilistic retrieval based on staged logistic regression - Cooper, Gey et al. - 1992
7: Attentional focus training by boundary region data selection - Davis, Hwang - 1992
7: Intelligent high-volume text processing using shallow - Hayes - 1992
6: Improved training via incremental learning - Utgoff - 1989
6: The automatic indexing system AIR/PHYS---from research to ap.. - Biebricher, Fuhr et al. - 1988
6: Combining model-oriented and description-oriented approaches.. - Fuhr, Pfeifer
3: A brief history of sequential analysis - Ghosh - 1991
2: Improving generalization with self-directed learning - Cohn, Atlas et al. - 1992
Documents on the same site (http://www.cora.jprc.com/Information_Retrieval/Retrieval/index.html):
Automated Classification Of Encounter Notes In A Computer Based.. - Aronow (1994)
Advantages of Query Biased Summaries in Information Retrieval - Tombros, Sanderson (1998)
The Hardware/Software Balancing Act for Information.. - Lu, McKinley, Cahoon