
A Comparison of Event Models for Naive Bayes Text Classification (1998)  (140 citations)
Andrew McCallum, Kamal Nigam




 
View or download:
cmu.edu/People/mcc...inomialaaai98w.ps
cmu.edu/~mccallum/...inomialaaai98w.ps
cmu.edu/~knigam/pa...ialaaaiws98.ps.gz

From:  cmu.edu/People/mccallum/
From:  cmu.edu/~knigam/resume


Abstract: Recent work in text classification has used two different first-order probabilistic models for classification, both of which make the naive Bayes assumption. Some use a multi-variate Bernoulli model, that is, a Bayesian Network with no dependencies between words and binary word features (e.g. Larkey and Croft 1996; Koller and Sahami 1997). Others use a multinomial model, that is, a uni-gram language model with integer word counts (e.g. Lewis and Gale 1994; Mitchell 1997). This paper aims to...
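
The two event models named in the abstract can be contrasted with a small sketch. This is an illustrative reading of the abstract, not the authors' implementation: the function names (`train`, `log_score_bernoulli`, `log_score_multinomial`, `classify`), the toy corpus, and the choice of Laplace (add-one) smoothing are all assumptions made for the example. The Bernoulli scorer uses binary word presence and also accounts for vocabulary words that are absent; the multinomial scorer uses integer word counts and ignores absent words.

```python
import math
from collections import Counter

def train(docs, labels, vocab):
    """Estimate class priors and word parameters for both event models.

    All names here are hypothetical; add-one smoothing is an assumption.
    """
    prior, bern, multi = {}, {}, {}
    for c in sorted(set(labels)):
        class_docs = [d for d, y in zip(docs, labels) if y == c]
        prior[c] = len(class_docs) / len(docs)
        # Bernoulli: smoothed fraction of class documents that contain each word.
        doc_freq = Counter(w for d in class_docs for w in set(d))
        bern[c] = {w: (1 + doc_freq[w]) / (2 + len(class_docs)) for w in vocab}
        # Multinomial: smoothed share of all word occurrences in the class.
        term_freq = Counter(w for d in class_docs for w in d)
        total = sum(term_freq[w] for w in vocab)
        multi[c] = {w: (1 + term_freq[w]) / (len(vocab) + total) for w in vocab}
    return prior, bern, multi

def log_score_bernoulli(doc, c, prior, bern, vocab):
    """Binary features: every vocabulary word contributes, present or absent."""
    present = set(doc)
    score = math.log(prior[c])
    for w in vocab:
        p = bern[c][w]
        score += math.log(p if w in present else 1.0 - p)
    return score

def log_score_multinomial(doc, c, prior, multi, vocab):
    """Integer counts: each occurrence of a word contributes once.

    Document-length and multinomial-coefficient terms are omitted; they do
    not affect the argmax if length is assumed independent of the class.
    """
    score = math.log(prior[c])
    for w, n in Counter(doc).items():
        if w in multi[c]:
            score += n * math.log(multi[c][w])
    return score

def classify(doc, prior, params, vocab, scorer):
    """Return the class with the highest log score under the given scorer."""
    return max(prior, key=lambda c: scorer(doc, c, prior, params, vocab))

# Toy corpus (invented for illustration only).
docs = [["ball", "goal", "ball"], ["goal", "team"],
        ["vote", "law"], ["law", "vote", "vote"]]
labels = ["sport", "sport", "politics", "politics"]
vocab = sorted({w for d in docs for w in d})

prior, bern, multi = train(docs, labels, vocab)
query = ["ball", "ball", "team"]
bernoulli_label = classify(query, prior, bern, vocab, log_score_bernoulli)
multinomial_label = classify(query, prior, multi, vocab, log_score_multinomial)
```

On this toy data the repeated word "ball" changes the multinomial score but not the Bernoulli score, since the latter only records whether a word occurs.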

Related documents from co-citation:
35:   Text categorization with Support Vector Machines: Learning with many relevant fe.. - Joachims - 1998
29:   Naive (Bayes) at forty: The independence assumption in information retrieval - Lewis - 1998
28:   Machine Learning - Mitchell - 1997

BibTeX entry:

A. McCallum and K. Nigam. A comparison of event models for Naive Bayes text classification. In AAAI-98 Workshop on Learning for Text Categorization, 1998. http://citeseer.ist.psu.edu/article/mccallum98comparison.html

@misc{ mccallum98comparison,
  author = "A. McCallum and K. Nigam",
  title = "A comparison of event models for Naive Bayes text classification",
  text = "A. McCallum and K. Nigam. A comparison of event models for Naive Bayes
    text classification. In AAAI-98 Workshop on Learning for Text Categorization,
    1998.",
  year = "1998",
  url = "citeseer.ist.psu.edu/article/mccallum98comparison.html" }
Citations (may not include all citations):
2319   Elements of Information Theory - Cover, Thomas - 1991
976   Machine Learning - Mitchell - 1997
376   Text categorization with Support Vector Machines: Learning w.. - Joachims - 1998
201   Relevance weighting of search terms - Robertson, Sparck-Jones - 1976
149   Learning to extract symbolic knowledge from the World Wide W.. - Craven, DiPasquo et al. - 1998
138   Bayesian network classifiers - Friedman, Geiger et al. - 1997
135   A sequential algorithm for training text classifiers - Lewis, Gale - 1994
130   A probabilistic analysis of the Rocchio algorithm with TFIDF.. - Joachims - 1997
128   On the optimality of the simple Bayesian classifier under zero-.. - Domingos, Pazzani - 1997
121   An analysis of Bayesian classifiers - Langley, Iba et al. - 1992
80   Learning to classify text from labeled and unlabeled documen.. - Nigam, McCallum et al. - 1998
76   A Bayesian approach to filtering junk e-mail - Sahami, Dumais et al. - 1998
73   An evaluation of phrasal and clustered representations on a .. - Lewis - 1992
59   Data Mining and Knowledge Discovery - Friedman - 1997
43   Combining classifiers in text categorization - Larkey, Croft - 1996
41   Feature selection in statistical learning of text categoriza.. - Yang, Pedersen - 1997
38   Active learning with committees for text categorization - Liere, Tadepalli - 1997
28   Learning limited dependence Bayesian classifiers - Sahami - 1996
18   Improving text classification by shrinkage in a hierarchy of .. - McCallum, Rosenfeld et al. - 1998
14   Document classification using a finite mixture model - Li, Yamanishi - 1997
12   Estimations of dependences based on statistical data - Vapnik - 1982
5   Document classification by machine: Theory and practice - Guthrie, Walker - 1994
1   Naive (Bayes) at forty: The independence assumption in information retrie.. - Lewis - 1998





Documents on the same site (http://www.cs.cmu.edu/People/mccallum/):
Learning to Classify Text from Labeled and Unlabeled Documents - Nigam (1998)
Building Domain-Specific Search Engines with Machine .. - McCallum, Nigam.. (1999)
Distributional Clustering of Words for Text Classification - Baker, McCallum (1998)
