Using Density Estimation to Improve Text Categorization (2002)
Carl Sable, Kathy McKeown, and Vasileios...



View or download:
columbia.edu/nlp/p...sable_al_02b.ps.gz

From: columbia.edu/nlp/papers/cubib

Abstract: This paper explores the use of a statistical technique known as density estimation to potentially improve the results of text categorization systems which label documents by computing similarities between documents and categories. In addition to potentially improving a system's overall accuracy, density estimation converts similarity scores to probabilities. These probabilities provide confidence measures for a system's predictions which are easily interpretable and could potentially help to...
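
Illustration (not from the paper): the abstract says density estimation converts similarity scores to probabilities. One generic way to do that is sketched below, and it may differ from the method the authors actually use. The sketch assumes that each training document has a similarity score for a category and a label saying whether it belongs to that category. It fits a Gaussian kernel density estimate to the scores of each group and applies Bayes' rule. The function names, the bandwidth value, and the sample scores are all hypothetical.

```python
import math


def gaussian_kde(samples, bandwidth):
    """Return a function that estimates the density of `samples`
    using one Gaussian kernel per sample (assumed estimator)."""
    n = len(samples)
    norm = 1.0 / (n * bandwidth * math.sqrt(2.0 * math.pi))

    def density(x):
        return norm * sum(
            math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in samples
        )

    return density


def score_to_probability(pos_scores, neg_scores, bandwidth=0.1):
    """Build a function mapping a similarity score to an estimate of
    P(document is in the category | score) via Bayes' rule."""
    pos_density = gaussian_kde(pos_scores, bandwidth)
    neg_density = gaussian_kde(neg_scores, bandwidth)
    prior_pos = len(pos_scores) / (len(pos_scores) + len(neg_scores))

    def probability(score):
        p = prior_pos * pos_density(score)
        q = (1.0 - prior_pos) * neg_density(score)
        total = p + q
        # Fall back to the prior where both densities underflow to zero.
        return p / total if total > 0.0 else prior_pos

    return probability


# Hypothetical training scores: documents in the category vs. not in it.
pos = [0.70, 0.80, 0.85, 0.90, 0.65]
neg = [0.10, 0.20, 0.30, 0.25, 0.40, 0.15]

to_prob = score_to_probability(pos, neg)
high = to_prob(0.85)  # score typical of in-category documents
low = to_prob(0.15)   # score typical of out-of-category documents
print(f"P(in category | 0.85) = {high:.3f}")
print(f"P(in category | 0.15) = {low:.3f}")
```

The output is a number between 0 and 1 that can be read as a confidence measure for the prediction, which is the property the abstract highlights. The bandwidth controls how smooth the estimate is and would need tuning on held-out data.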

Active bibliography (related documents):
0.6:   Classification Techniques for Categorization of Hypertext Documents - Arumugam
0.3:   Multimodal Video Indexing: A Review of the State-of-the-art - Snoek, Worring (2003)
0.3:   From Region Features to Semantic Labels: A Probabilistic Approach - Li, Leow (2003)

Similar documents based on text:
0.3:   Using Bins to Empirically Estimate Term Weights for Text.. - Sable, Church (2001)
0.3:   Text-Based Approaches for the Categorization of Images - Sable, Hatzivassiloglou (1999)
0.3:   Software Re-Use and Evolution in Text Generation Applications - Karen Kukich (1997)

BibTeX entry:

@misc{ text-using,
  author = "Carl Sable and Kathy McKeown and Vasileios...",
  title = "Using Density Estimation to Improve Text Categorization",
  year = "2002",
  url = "citeseer.ist.psu.edu/733071.html" }
Citations (may not include all citations):
512   Density Estimation for Statistics and Data Analysis - Silverman - 1986
463   Term weighting approaches in automatic text retrieval - Salton, Buckley - 1988
432   Automatic Text Processing: The Transformation - Salton - 1989
416   Information Retrieval - van Rijsbergen - 1979
376   Text categorization with support vector machines: Learning w.. - Joachims - 1998
263   A stochastic parts program and noun phrase parser for unrest.. - Church - 1988
166   A re-examination of text categorization methods - Yang, Liu - 1999
149   An evaluation of statistical approaches to text categorizati.. - Yang - 1999
130   A probabilistic analysis of the Rocchio algorithm with TFIDF.. - Joachims - 1997
103   Naive (Bayes) at forty: The independence assumption in information retriev.. - Lewis - 1998
76   Boostexter: A boosting-based system for text categorization - Schapire, Singer - 2000
72   Bow: A toolkit for statistical language modeling - McCallum - 1996
62   Reuters-21578 text categorization test collection - Lewis - 1997
21   Training algorithms for linear text classifiers - Lewis, Schapire et al. - 1996
20   Integration of visual and text-based approaches for the cont.. - Paek, Sable et al. - 1999
7   Assessing the calibration of Naive Bayes' posterior estimate.. - Bennett - 2000
7   Using maximum entropy for text classification - Nigam, Lafferty et al. - 1999
5   Text-based approaches for non-topical image categorization - Sable, Hatzivassiloglou - 2000
3   Indoor-outdoor image classification - Szummer, Picard - 1998
1   Using asymmetric distributions to improve classifier probabil.. - Bennett - 2002

Documents on the same site (http://www1.cs.columbia.edu/nlp/papers/cu-bib.html):
Domain Word Translation by Space-Frequency Analysis of Context.. - Fung (1996)
Resources for Evaluation of Summarization Techniques - Klavans, McKeown, Kan, Lee (1998)
Word Informativeness and Automatic Pitch Accent Modeling - Pan, McKeown (1999)

CiteSeer.IST - Copyright Penn State and NEC