
Learning to Classify Text from Labeled and Unlabeled Documents (1998)  (80 citations)
Kamal Nigam, Andrew K. McCallum, Sebastian Thrun, Tom M. Mitchell
Proceedings of AAAI-98, 15th Conference of the American Association for Artificial Intelligence





Abstract: This paper shows that the accuracy of learned text classifiers can be improved by augmenting small numbers of labeled training documents with a large pool of unlabeled documents. This is significant because in many important text classification problems obtaining classification labels is expensive, while large quantities of unlabeled documents are readily available. We present a theoretical argument showing that, under common assumptions, unlabeled data contain information about the target...


BibTeX entry:

K. Nigam, A. McCallum, S. Thrun, and T. Mitchell. Learning to classify text from labeled and unlabeled documents. In Proceedings of the Fifteenth National Conference on Artificial Intelligence. AAAI Press, 1998. http://citeseer.ist.psu.edu/article/nigam98learning.html

@inproceedings{ nigam98learning,
    author = "Kamal Nigam and Andrew K. McCallum and Sebastian Thrun and Tom M. Mitchell",
    title = "Learning to classify text from labeled and unlabeled documents",
    booktitle = "Proceedings of {AAAI}-98, 15th Conference of the American Association for Artificial Intelligence",
    publisher = "AAAI Press, Menlo Park, US",
    address = "Madison, US",
    pages = "792--799",
    year = "1998",
    url = "citeseer.ist.psu.edu/article/nigam98learning.html" }




