Abstract: This paper shows that the accuracy of learned text
classifiers can be improved by augmenting small numbers
of labeled training documents with a large pool
of unlabeled documents. This is significant because in
many important text classification problems obtaining
classification labels is expensive, while large quantities
of unlabeled documents are readily available. We
present a theoretical argument showing that, under
common assumptions, unlabeled data contain information
about the target...
Cited by:
Concept Drift and the Importance of Examples - Klinkenberg, Rüping (2002)
The Maximum-Margin Approach to Learning Text Classifiers -.. - Joachims (2000)
A New Text Categorization Technique Using - Distributional Clustering And (2006)
Similar documents (at the sentence level):
47.3%: Using EM to Classify Text from Labeled and Unlabeled Documents - Nigam (1998)
27.5%: Learning to Classify Text from Labeled and Unlabeled.. - Nigam, McCallum, Thrun, .. (1998)
10.1%: Text Classification from Labeled and Unlabeled.. - Nigam, McCallum.. (1999)
Active bibliography (related documents):
0.6: Employing EM and Pool-Based Active Learning for Text Classification - McCallum (1998)
0.6: Discovery of implicit and explicit connections between .. - Robert McArthur.. (2003)
0.4: Automated Modeling and Nonlinear Axis Scaling - Leejay Wu (2005)
Similar documents based on text:
0.5: A Parallel Learning Algorithm for Text Classification - Kruengkrai, Jaruskulchai (2002)
0.5: Using Unlabeled Data to Improve Text Classification - Nigam (2001)
0.4: Pool-Based Active Learning for Text Classification - Nigam, McCallum (1998)
Related documents from co-citation:
26: Combining labeled and unlabeled data with co-training - Blum, Mitchell - 1998
23: Text categorization with Support Vector Machines: Learning with many relevant fe.. - Joachims - 1998
20: Machine Learning - Mitchell - 1997
BibTeX entry:
K. Nigam, A. McCallum, S. Thrun, and T. Mitchell. Learning to classify text from labeled and unlabeled documents. In Proceedings of the Fifteenth National Conference on Artificial Intelligence. AAAI Press, 1998. http://citeseer.ist.psu.edu/article/nigam98learning.html
@inproceedings{ nigam98learning,
author = "Kamal Nigam and Andrew K. McCallum and Sebastian Thrun and Tom M. Mitchell",
title = "Learning to classify text from labeled and unlabeled documents",
booktitle = "Proceedings of {AAAI}-98, 15th Conference of the American Association for Artificial Intelligence",
publisher = "AAAI Press, Menlo Park, US",
address = "Madison, US",
pages = "792--799",
year = "1998",
url = "http://citeseer.ist.psu.edu/article/nigam98learning.html" }
Documents on the same site (http://www.cs.cmu.edu/afs/cs/project/jair/pub/volume8/fuernkranz98a-html/node30.html):
Conditions for Occam's Razor Applicability and Noise Elimination - Gamberger, Lavrac (1997)
Tractable Induction and Classification in First Order .. - Michèle.. (1997)
Query by Committee - Seung, Opper, Sompolinsky (1992)
CiteSeer.IST - Copyright Penn State and NEC