Probabilistic Latent Semantic Indexing (1999)

by Thomas Hofmann
Citations: 545 (7 self)

BibTeX

@MISC{Hofmann99probabilisticlatent,
    author = {Thomas Hofmann},
    title = {Probabilistic Latent Semantic Indexing},
    year = {1999}
}

Abstract

Probabilistic Latent Semantic Indexing is a novel approach to automated document indexing which is based on a statistical latent class model for factor analysis of count data. Fitted from a training corpus of text documents by a generalization of the Expectation Maximization algorithm, the utilized model is able to deal with domain-specific synonymy as well as with polysemous words. In contrast to standard Latent Semantic Indexing (LSI) by Singular Value Decomposition, the probabilistic variant has a solid statistical foundation and defines a proper generative data model. Retrieval experiments on a number of test collections indicate substantial performance gains over direct term matching methods as well as over LSI. In particular, the combination of models with different dimensionalities has proven to be advantageous.
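
The fitting procedure mentioned in the abstract can be illustrated with a small sketch. It assumes the latent class model takes the form P(d, w) = sum over z of P(z) P(d|z) P(w|z), and that the "generalization" of EM is a tempered E-step controlled by an exponent beta (beta = 1.0 reduces to plain EM). The function name plsa, its parameters, and the dense list-of-lists count matrix are illustrative choices, not taken from the paper, and the code is not the author's implementation.

```python
import random


def plsa(counts, n_topics, n_iter=50, beta=1.0, seed=0):
    """Fit P(d, w) = sum_z P(z) P(d|z) P(w|z) to a count matrix by EM.

    counts:   counts[d][w] = occurrences of word w in document d
              (assumed to contain at least one nonzero entry).
    n_topics: number of latent classes z (hypothetical parameter name).
    beta:     1.0 gives plain EM; beta < 1 tempers the E-step posterior
              (an assumed form of the generalized EM in the abstract).
    """
    rng = random.Random(seed)
    n_docs, n_words = len(counts), len(counts[0])

    def random_dist(n):
        # Strictly positive random values, normalized to sum to 1.
        xs = [rng.random() + 0.1 for _ in range(n)]
        total = sum(xs)
        return [x / total for x in xs]

    p_z = [1.0 / n_topics] * n_topics
    p_d_z = [random_dist(n_docs) for _ in range(n_topics)]   # p_d_z[z][d]
    p_w_z = [random_dist(n_words) for _ in range(n_topics)]  # p_w_z[z][w]

    for _ in range(n_iter):
        new_z = [0.0] * n_topics
        new_d = [[0.0] * n_docs for _ in range(n_topics)]
        new_w = [[0.0] * n_words for _ in range(n_topics)]

        for d in range(n_docs):
            for w in range(n_words):
                n = counts[d][w]
                if n == 0:
                    continue
                # E-step: posterior over latent classes for this (d, w) pair.
                post = [p_z[z] * (p_d_z[z][d] * p_w_z[z][w]) ** beta
                        for z in range(n_topics)]
                norm = sum(post)
                if norm == 0.0:
                    continue
                # Accumulate expected counts for the M-step.
                for z in range(n_topics):
                    r = n * post[z] / norm
                    new_z[z] += r
                    new_d[z][d] += r
                    new_w[z][w] += r

        # M-step: renormalize expected counts into probabilities.
        total = sum(new_z)
        for z in range(n_topics):
            if new_z[z] > 0.0:
                p_d_z[z] = [x / new_z[z] for x in new_d[z]]
                p_w_z[z] = [x / new_z[z] for x in new_w[z]]
        p_z = [x / total for x in new_z]

    return p_z, p_d_z, p_w_z


if __name__ == "__main__":
    # Toy corpus: two documents use words 0-1, two use words 2-3.
    toy = [[4, 3, 0, 0], [5, 2, 0, 0], [0, 0, 3, 4], [0, 0, 2, 5]]
    p_z, p_d_z, p_w_z = plsa(toy, n_topics=2)
    print("P(z):", p_z)
    print("P(w|z):", p_w_z)
```

In this sketch each latent class z carries its own word distribution P(w|z), which is how a model of this kind can place synonyms in the same class and spread a polysemous word across several classes. A real retrieval system would use sparse counts and held-out data to choose beta and the number of classes.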

Citations

6231 Maximum likelihood from incomplete data via the EM algorithm - Dempster, Laird - 1977
2699 Introduction to Modern Information Retrieval - Salton, McGill - 1986
2168 Indexing by latent semantic analysis - Deerwester, Dumais, et al. - 1990
612 A view of the EM algorithm that justifies incremental, sparse, and other variants - Neal, Hinton - 1998
595 Finite Mixture Models - McLachlan, Peel - 1997
477 Distributional clustering of English words - Pereira, Tishby, et al. - 1993
375 Probabilistic latent semantic analysis - Hofmann - 1999
128 Latent class models for collaborative filtering - Hofmann, Puzicha - 1999
89 Unsupervised learning from dyadic data - Hofmann, Puzicha - 1998
72 A deterministic annealing approach to clustering - Rose, Gurewitz, et al. - 1990
46 Latent Semantic Indexing (LSI): TREC-3 report, in Overview of the Third Text REtrieval Conference - Dumais - 1994
46 Predicting the Performance of Linearly Combined IR Systems - Vogt, Cottrell - 1998
37 Differential geometry and statistics - Murray, Rice - 1993
35 Topic-based language models using EM - Gildea, Hofmann - 1999
26 A view of the EM algorithm that justifies incremental, sparse, and other variants - Neal, Hinton - 1998
7 Differential Geometry and Statistics - Murray, Rice, et al. - 1993
6 Aggregate and mixed-order Markov models for statistical language processing - Saul, Pereira - 1997
2 Latent class models for collaborative filtering - Hofmann - 1999