• Documents
  • Authors
  • Tables
  • Log in
  • Sign up
  • MetaCart
  • DMCA
  • Donate

CiteSeerX logo

Advanced Search Include Citations

Tools

Sorted by:
Try your query at:
Semantic Scholar Scholar Academic
Google Bing DBLP
Results 1 - 10 of 1,083
Next 10 →

Probabilistic Latent Semantic Indexing

by Thomas Hofmann , 1999
"... Probabilistic Latent Semantic Indexing is a novel approach to automated document indexing which is based on a statistical latent class model for factor analysis of count data. Fitted from a training corpus of text documents by a generalization of the Expectation Maximization algorithm, the utilized ..."
Abstract - Cited by 1225 (10 self) - Add to MetaCart
Probabilistic Latent Semantic Indexing is a novel approach to automated document indexing which is based on a statistical latent class model for factor analysis of count data. Fitted from a training corpus of text documents by a generalization of the Expectation Maximization algorithm, the utilized

Indexing by latent semantic analysis

by Scott Deerwester, Susan T. Dumais, George W. Furnas, Thomas K. Landauer, Richard Harshman - JOURNAL OF THE AMERICAN SOCIETY FOR INFORMATION SCIENCE , 1990
"... A new method for automatic indexing and retrieval is described. The approach is to take advantage of implicit higher-order structure in the association of terms with documents (“semantic structure”) in order to improve the detection of relevant documents on the basis of terms found in queries. The p ..."
Abstract - Cited by 3779 (35 self) - Add to MetaCart
A new method for automatic indexing and retrieval is described. The approach is to take advantage of implicit higher-order structure in the association of terms with documents (“semantic structure”) in order to improve the detection of relevant documents on the basis of terms found in queries

Unsupervised Learning by Probabilistic Latent Semantic Analysis

by Thomas Hofmann - Machine Learning , 2001
"... Abstract. This paper presents a novel statistical method for factor analysis of binary and count data which is closely related to a technique known as Latent Semantic Analysis. In contrast to the latter method which stems from linear algebra and performs a Singular Value Decomposition of co-occurren ..."
Abstract - Cited by 618 (4 self) - Add to MetaCart
results for different types of text and linguistic data collections and discusses an application in automated document indexing. The experiments indicate substantial and consistent improvements of the probabilistic method over standard Latent Semantic Analysis.

Using Linear Algebra for Intelligent Information Retrieval

by Michael W. Berry, Susan T. Dumais - SIAM REVIEW , 1995
"... Currently, most approaches to retrieving textual materials from scientific databases depend on a lexical match between words in users' requests and those in or assigned to documents in a database. Because of the tremendous diversity in the words people use to describe the same document, lexical ..."
Abstract - Cited by 676 (18 self) - Add to MetaCart
by 200-300 of the largest singular vectors are then matched against user queries. We call this retrieval method Latent Semantic Indexing (LSI) because the subspace represents important associative relationships between terms and documents that are not evident in individual documents. LSI is a completely

Latent semantic indexing: A probabilistic analysis

by Christos H. Papadimitriou, Prabhakar Raghavan, Hisao Tamaki , Santosh Vempala , 1998
"... Latent semantic indexing (LSI) is an information retrieval technique based on the spectral analysis of the term-document matrix, whose empirical success had heretofore been without rigorous prediction and explanation. We prove that, under certain conditions, LSI does succeed in capturing the underl ..."
Abstract - Cited by 323 (7 self) - Add to MetaCart
Latent semantic indexing (LSI) is an information retrieval technique based on the spectral analysis of the term-document matrix, whose empirical success had heretofore been without rigorous prediction and explanation. We prove that, under certain conditions, LSI does succeed in capturing

Imagenet: A large-scale hierarchical image database

by Jia Deng, Wei Dong, Richard Socher, Li-jia Li, Kai Li, Li Fei-fei - In CVPR , 2009
"... The explosion of image data on the Internet has the potential to foster more sophisticated and robust models and algorithms to index, retrieve, organize and interact with images and multimedia data. But exactly how such data can be harnessed and organized remains a critical problem. We introduce her ..."
Abstract - Cited by 840 (28 self) - Add to MetaCart
The explosion of image data on the Internet has the potential to foster more sophisticated and robust models and algorithms to index, retrieve, organize and interact with images and multimedia data. But exactly how such data can be harnessed and organized remains a critical problem. We introduce

Recovering Documentation-to-Source-Code Traceability Links using Latent Semantic Indexing

by Andrian Marcus, Jonathan I. Maletic
"... An information retrieval technique, latent semantic indexing, is used to automatically identi traceability links from system documentation to program source code. The results of two experiments to identi links in existing software systems (i.e., the LEDA library, and Albergate) are presented. These ..."
Abstract - Cited by 237 (13 self) - Add to MetaCart
An information retrieval technique, latent semantic indexing, is used to automatically identi traceability links from system documentation to program source code. The results of two experiments to identi links in existing software systems (i.e., the LEDA library, and Albergate) are presented

Distributional Clustering of Words for Text Classification

by L. Douglas Baker, Andrew Kachites Mccallum , 1998
"... This paper describes the application of Distributional Clustering [20] to document classification. This approach clusters words into groups based on the distribution of class labels associated with each word. Thus, unlike some other unsupervised dimensionalityreduction techniques, such as Latent Sem ..."
Abstract - Cited by 298 (1 self) - Add to MetaCart
% accuracy---significantly better than Latent Semantic Indexing [6], class-based clustering [1], feature selection by mutual information [23], or Markov-blanket-based feature selection [13]. We also show that less aggressive clustering sometimes results in improved classification accuracy over classification

Random Indexing of Text Samples for Latent Semantic Analysis

by Pentti Kanerva, Jan Kristoferson, Anders Holst - In Proceedings of the 22nd Annual Conference of the Cognitive Science Society , 2000
"... VD, the result is not nearly as good: only 36% correct. The authors conclude that the reorganization of information by SVD somehow corresponds to human psychology. We have studied high-dimensional random distributed representations, as models of brainlike representation of information (Kanerva, 199 ..."
Abstract - Cited by 123 (4 self) - Add to MetaCart
is represented by a 30,000-bit vector with a single 1 marking the place of the sample in a list of all samples, and call it the sample's index vector (i.e., the nth bit of the index vector for the nth text sample is 1---the representation is unitary or local) . Then the words-by-contexts matrix

Latent Semantic Kernels

by Nello Cristianini, John Shawe-Taylor, HUMA LODHI
"... Kernel methods like Support Vector Machines have successfully been used for text categorization. A standard choice of kernel function has been the inner product between the vector-space representationoftwo documents, in analogy with classical information retrieval (IR) approaches. Latent Semantic In ..."
Abstract - Cited by 114 (9 self) - Add to MetaCart
Kernel methods like Support Vector Machines have successfully been used for text categorization. A standard choice of kernel function has been the inner product between the vector-space representationoftwo documents, in analogy with classical information retrieval (IR) approaches. Latent Semantic
Next 10 →
Results 1 - 10 of 1,083
Powered by: Apache Solr
  • About CiteSeerX
  • Submit and Index Documents
  • Privacy Policy
  • Help
  • Data
  • Source
  • Contact Us

Developed at and hosted by The College of Information Sciences and Technology

© 2007-2019 The Pennsylvania State University