Results 1 - 6 of 6
Indexing by latent semantic analysis
- JOURNAL OF THE AMERICAN SOCIETY FOR INFORMATION SCIENCE
, 1990
Abstract
-
Cited by 2168 (30 self)
A new method for automatic indexing and retrieval is described. The approach is to take advantage of implicit higher-order structure in the association of terms with documents (“semantic structure”) in order to improve the detection of relevant documents on the basis of terms found in queries. The particular technique used is singular-value decomposition, in which a large term by document matrix is decomposed into a set of ca. 100 orthogonal factors from which the original matrix can be approximated by linear combination. Documents are represented by ca. 100 item vectors of factor weights. Queries are represented as pseudo-document vectors formed from weighted combinations of terms, and documents with supra-threshold cosine values are returned. Initial tests find this completely automatic method for retrieval to be promising.
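The pipeline described in this abstract (truncated SVD of a term by document matrix, queries folded in as pseudo-documents, cosine thresholding) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the toy matrix, the term list, the choice of K = 2 factors instead of ca. 100, the 0.5 threshold, and the helper names `query_vector` and `retrieve` are all assumptions made for the example, and projecting both queries and documents onto the top-K left singular vectors is only one common convention.

```python
import numpy as np

# Hypothetical toy data: rows are terms, columns are documents.
terms = ["human", "computer", "interface", "graph", "tree"]
X = np.array([
    [1, 1, 0, 0],   # human
    [1, 0, 0, 0],   # computer
    [1, 1, 0, 0],   # interface
    [0, 0, 1, 1],   # graph
    [0, 0, 1, 1],   # tree
], dtype=float)

K = 2  # number of factors kept (the abstract uses ca. 100 on real data)

# Singular-value decomposition; keep only the K largest factors.
U, s, Vt = np.linalg.svd(X, full_matrices=False)
U_k = U[:, :K]

# Each document as a K-item vector of factor weights.
doc_vectors = X.T @ U_k          # shape: (n_docs, K)


def query_vector(query_terms):
    """Fold a query into the factor space as a pseudo-document."""
    q = np.array([1.0 if t in query_terms else 0.0 for t in terms])
    return q @ U_k


def retrieve(query_terms, threshold=0.5):
    """Return indices of documents whose cosine with the query exceeds threshold."""
    q = query_vector(query_terms)
    q_norm = np.linalg.norm(q)
    if q_norm == 0.0:
        return []                # query shares no terms with the vocabulary
    sims = doc_vectors @ q / (np.linalg.norm(doc_vectors, axis=1) * q_norm)
    return [int(j) for j in np.nonzero(sims > threshold)[0]]


print(retrieve({"computer"}))
```

In this toy matrix the second document does not contain the word "computer", yet it is still returned for that query because it shares a factor with the first document, which is the effect the abstract attributes to higher-order structure.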
The Vocabulary Problem in Human-System Communication
- COMMUNICATIONS OF THE ACM
, 1987
Abstract
-
Cited by 353 (6 self)
In almost all computer applications, users must enter correct words for the desired objects or actions. For success without extensive training, or in first-tries for new targets, the system must recognize terms that will be chosen spontaneously. We studied spontaneous word choice for objects in five application-related domains, and found the variability to be surprisingly large. In every case two people favored the same term with probability <0.20. Simulations show how this fundamental property of language limits the success of various design methodologies for vocabulary-driven interaction. For example, the popular approach in which access is via one designer's favorite single word will result in 80-90 percent failure rates in many common situations. An optimal strategy, unlimited aliasing, is derived and shown to be capable of several-fold improvements.
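The agreement figure and the benefit of aliasing can be illustrated with a small calculation. This is a simplified sketch, not the paper's simulation: the word counts below are invented, the two people are assumed to choose independently from the same distribution, and `coverage_with_aliases` models only the simplest case in which the system accepts the m most popular words for a single object.

```python
from collections import Counter

# Hypothetical counts of the words people used for one object.
counts = Counter({
    "delete": 20, "remove": 15, "erase": 15, "kill": 10, "omit": 10,
    "drop": 10, "cut": 5, "clear": 5, "discard": 5, "destroy": 5,
})


def agreement_probability(counts):
    """Probability that two independent people pick the same word."""
    total = sum(counts.values())
    return sum((c / total) ** 2 for c in counts.values())


def coverage_with_aliases(counts, m):
    """Probability that a user's word is among the m most popular aliases."""
    total = sum(counts.values())
    top = sorted(counts.values(), reverse=True)[:m]
    return sum(top) / total


print(agreement_probability(counts))      # chance two people agree
print(coverage_with_aliases(counts, 1))   # a single designer-chosen word
print(coverage_with_aliases(counts, 3))   # three accepted aliases
```

With these invented counts, a single accepted word fails for most users, while each additional alias raises the share of first attempts the system recognizes.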
Paradox of the Active User
, 1987
Abstract
-
Cited by 84 (5 self)
One of the most sweeping changes ever in the ecology of human cognition may be taking place today. People are beginning to learn and use very powerful and sophisticated information processing technology as a matter of daily life. From the perspective of human history, this could be a transitional point dividing a period when machines merely helped us do things from a period when machines will seriously help us think about things. But if this is so, we are indeed still very much within the transition. For most people, computers have more possibility than they have real practical utility.
Indexing by Latent Semantic Analysis
- Journal of the American Society for Information Science
, 2001
Abstract
A new method for automatic indexing and retrieval is described. The approach is to take advantage of implicit higher-order structure in the association of terms with documents ("semantic structure") in order to improve the detection of relevant documents on the basis of terms found in queries. The particular technique used is singular-value decomposition, in which a large term by document matrix is decomposed into a set of ca 100 orthogonal factors from which the original matrix can be approximated by linear combination. Documents are represented by ca 100 item vectors of factor weights. Queries are represented as pseudo-document vectors formed from weighted combinations of terms, and documents with supra-threshold cosine values are returned. Initial tests find this completely automatic method for retrieval to be promising. Deerwester - 1 - 1.
Experience With An Adaptive Indexing Scheme
- In Human Factors in Computer Systems, CHI’85 Proceedings
, 1985
Abstract
Previous work has shown that there is a major vocabulary barrier for new or intermittent users of computer systems. The barrier can be substantially lowered with a rich, empirically defined, frequency weighted index. This paper discusses experience with an adaptive technique for constructing such an index. In addition to being an easy way for system designers to collect the necessary data, an adaptive system has the additional advantage that data is collected from real users in real situations, not in some laboratory approximation. Implementation considerations, preliminary results and future theoretical directions are discussed.
1. Introduction
For several years our research group has been studying the words people use when referring to various types of objects and operations [1,2]. We have been interested in this because, mouses and icons notwithstanding, computers will continue to require people to use words to access many things. The problem is that these "access" words are large...
INTELLIGENCE CHINESE DOCUMENT SEMANTIC INDEXING SYSTEM
Abstract
With the rapid growth of the Internet, how to get information from this huge information space becomes an even more important problem. In this paper, an Intelligence Chinese Document Semantic Indexing System (ICDSIS) is proposed. Some new technologies are integrated in ICDSIS to obtain good performance. ICDSIS is composed of four key procedures: a parallel, distributed and configurable Spider is used for information gathering; a multi-hierarchy document classification approach incorporating information gain performs the initial processing of gathered web documents; a swarm intelligence based document clustering method is used for information organization; and a concept-based retrieval interface is applied for interactive user retrieval. ICDSIS is a comprehensive solution for information retrieval on the Internet.

