Improving text retrieval for the routing problem using latent semantic indexing
- In Proc. of the 17th ACM-SIGIR Conference
, 1994
Cited by 83 (2 self)
Latent Semantic Indexing (LSI) is a novel approach to information retrieval that attempts to model the underlying structure of term associations by transforming the traditional representation of documents as vectors of weighted term frequencies to a new coordinate space where both documents and terms are represented as linear combinations of underlying semantic factors. In previous research, LSI has produced a small improvement in retrieval performance. In this paper, we apply LSI to the routing task, which operates under the assumption that a sample of relevant and non-relevant documents is available to use in constructing the query. Once again, LSI slightly improves performance. However, when LSI is used in conjunction with statistical classification, there is a dramatic improvement in performance.
A survey of information retrieval and filtering methods
, 1995
Cited by 82 (0 self)
We survey the major techniques for information retrieval. In the first part, we provide an overview of the traditional ones (full text scanning, inversion, signature files and clustering). In the second part we discuss attempts to include semantic information (natural language processing, latent semantic indexing and neural networks).
Machine learning for information retrieval: neural networks, symbolic learning, and genetic algorithms
- Journal of the American Society for Information Science
, 1995
Cited by 56 (9 self)
Information retrieval using probabilistic techniques has attracted significant attention on the part of researchers in information and computer science over the past few decades. In the 1980s, knowledge-based techniques also made an impressive contribution to “intelligent” information retrieval and indexing. More recently, information science researchers have turned to other newer artificial-intelligence-based inductive learning techniques including neural networks, symbolic learning, and genetic algorithms. These newer techniques, which are grounded on diverse paradigms, have provided great opportunities for researchers to enhance the information processing and retrieval capabilities of current information storage and retrieval systems. In this article, we first provide an overview of these newer techniques and their use in information science research. To familiarize readers with these techniques, we present three popular methods: the connectionist Hopfield network; the symbolic ID3/ID5R; and evolution-based genetic algorithms. We discuss their knowledge representations and algorithms in the context of information retrieval. Sample implementation and testing results from our own research are also provided for each technique. We believe these techniques are promising in their ability to analyze user queries, identify users’ information needs, and suggest alternatives for search. With proper user-system interactions, these methods can greatly complement the prevailing full-text, keyword-based, probabilistic, and knowledge-based techniques.
Regularizing ad hoc retrieval scores
, 2005
Cited by 31 (1 self)
The cluster hypothesis states: closely related documents tend to be relevant to the same request. We exploit this hypothesis directly by adjusting ad hoc retrieval scores from an initial retrieval so that topically related documents receive similar scores. We refer to this process as score regularization. Score regularization can be presented as an optimization problem, allowing the use of results from semisupervised learning. We demonstrate that regularized scores consistently and significantly rank documents better than un-regularized scores, given a variety of initial retrieval algorithms. We evaluate our method on two large corpora across a substantial number of topics.
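The idea of making topically related documents receive similar scores can be illustrated with a minimal sketch. It assumes one common formulation borrowed from semisupervised learning (iterative smoothing over a normalized document-affinity graph); the abstract does not state the exact objective, so the function name `regularize_scores`, the parameter `alpha`, and the affinity matrix layout are all hypothetical choices made for illustration.

```python
import math


def regularize_scores(scores, affinity, alpha=0.5, iterations=50):
    """Smooth retrieval scores over a document-affinity graph.

    scores    -- initial retrieval scores, one per document
    affinity  -- symmetric, non-negative matrix (list of lists);
                 affinity[i][j] is the similarity of documents i and j
    alpha     -- assumed trade-off: 0 keeps the initial scores,
                 values near 1 favor agreement between neighbors
    """
    n = len(scores)
    degree = [sum(row) for row in affinity]

    # Symmetric normalization D^{-1/2} W D^{-1/2}; isolated documents get zeros.
    norm = [
        [
            affinity[i][j] / math.sqrt(degree[i] * degree[j])
            if degree[i] > 0 and degree[j] > 0
            else 0.0
            for j in range(n)
        ]
        for i in range(n)
    ]

    f = list(scores)
    for _ in range(iterations):
        # Each new score mixes the neighbors' current scores with the initial score.
        f = [
            alpha * sum(norm[i][j] * f[j] for j in range(n))
            + (1 - alpha) * scores[i]
            for i in range(n)
        ]
    return f


if __name__ == "__main__":
    initial = [1.0, 0.2, 0.1]
    # Documents 0 and 1 are topically related; document 2 is isolated.
    graph = [[0, 1, 0], [1, 0, 0], [0, 0, 0]]
    print(regularize_scores(initial, graph))
```

In this toy example the two related documents end up with closer scores than they started with, while still keeping their original order. Note that in this particular formulation an isolated document's score is simply scaled by `1 - alpha`.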
Relevance and Reinforcement in Interactive Browsing
, 2000
Cited by 12 (4 self)
We consider the problem of browsing the top ranked portion of the documents returned by an information retrieval system. We describe an interactive relevance feedback agent that analyzes the inter-document similarities and can help the user to locate the interesting information quickly. We show how such an agent can be designed and improved by using neural networks and reinforcement learning. We demonstrate that its performance significantly exceeds the performance of the traditional relevance feedback approach.
Regularizing query-based retrieval scores
- Information Retrieval
, 2007
Cited by 9 (2 self)
We adapt the cluster hypothesis for score-based information retrieval by claiming that closely related documents should have similar scores. Given a retrieval from an arbitrary system, we describe an algorithm which directly optimizes this objective by adjusting retrieval scores so that topically related documents receive similar scores. We refer to this process as score regularization. Because score regularization operates on retrieval scores, regardless of their origin, we can apply the technique to arbitrary initial retrieval rankings. Document rankings derived from regularized scores, when compared to rankings derived from un-regularized scores, consistently and significantly result in improved performance given a variety of baseline retrieval algorithms. We also present several proofs demonstrating that regularization generalizes methods such as pseudo-relevance feedback, document expansion, and cluster-based retrieval. Because of these strong empirical and theoretical results, we argue for the adoption of score regularization as a general design principle or post-processing step for information retrieval systems.
Applications of Machine Learning in Information Retrieval
, 1997
Cited by 8 (0 self)
Information retrieval systems provide access to collections of thousands, or millions, of documents, from which, by providing an appropriate description, users can recover any one. Typically, users iteratively refine the descriptions they provide to satisfy their needs, and retrieval systems can utilize user feedback on selected documents to indicate the accuracy of
Adaptive Concept-based Retrieval Using a Neural Network
- In Proceedings of ACM SIGIR Workshop on Mathematical/Formal Methods in Information Retrieval
, 2000
Cited by 6 (2 self)
There is considerable interest in bridging the gap between the terminology used in defining queries and the terminology used in representing documents. Some approaches use rules to capture user query concepts. The rules are usually weighted and expressed by AND/OR logical connectives. One difficulty with these approaches is that the resulting performance is quite sensitive to the weight assignments. We develop a neural network model in which the rule weights can be adjusted by users' relevance feedback. Experiments are performed on a small document collection and the adjusted rules show excellent performance in terms of recall-precision values. 1. Introduction Many intelligent retrieval approaches [3, 7, 8] have tried to bridge the terminological gap that exists between the way in which users specify their information needs and the way in which queries are expressed. One proposed approach involves using production rules to capture user query concepts. The main ideas underlying such an...
Adaptive Feedback Methods in an Extended Boolean Model
- University of Southern California, Los Angeles, USA. During
, 2001
Cited by 3 (0 self)
Relevance feedback methods have been used in information retrieval to generate improved query formulations based on information contained in previously retrieved documents. The relevance feedback techniques have been applied to extended Boolean query formulations as well as to vector query formulations. In this paper, we propose an adaptive way to improve the retrieval performance in an extended Boolean model. We develop a neural network model in which the weights used in extended Boolean queries can be adjusted by users' relevance feedback. Experiments are performed on a TREC collection and the results show improved performance even after applying the previous feedback methods.
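To show where query weights enter an extended Boolean score, here is a minimal sketch. It assumes the p-norm model of Salton, Fox and Wu, which is the usual meaning of "extended Boolean"; the abstract does not say which variant the paper uses, and the neural-network weight adjustment itself is not reproduced here. The function names `pnorm_or` and `pnorm_and` are hypothetical.

```python
def pnorm_or(doc_weights, query_weights, p=2.0):
    """p-norm OR: similarity of a document to a weighted OR query.

    doc_weights   -- term weights of the document, each in [0, 1]
    query_weights -- query term weights (the quantities feedback would adjust)
    p             -- p=1 behaves like a weighted average; larger p is stricter
    """
    num = sum((q ** p) * (d ** p) for d, q in zip(doc_weights, query_weights))
    den = sum(q ** p for q in query_weights)
    return (num / den) ** (1.0 / p)


def pnorm_and(doc_weights, query_weights, p=2.0):
    """p-norm AND: one minus the weighted distance from the all-terms-present point."""
    num = sum((q ** p) * ((1.0 - d) ** p) for d, q in zip(doc_weights, query_weights))
    den = sum(q ** p for q in query_weights)
    return 1.0 - (num / den) ** (1.0 / p)


if __name__ == "__main__":
    doc = [1.0, 0.0]    # the document contains the first query term only
    query = [1.0, 1.0]  # equal query term weights
    print(pnorm_or(doc, query), pnorm_and(doc, query))
```

With `p = 1` both operators reduce to the same weighted average, so the distinction between AND and OR disappears; raising a term's query weight increases its influence on the score under either operator.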
Neural Network Agents for Learning Semantic Text Classification
- Information Retrieval
, 2000
Cited by 3 (1 self)
The research project AgNeT develops Agents for Neural Text routing on the Internet. Unrestricted, potentially faulty text messages arrive at a certain delivery point (e.g. email address or world wide web address). These text messages are scanned and then distributed to one of several expert agents according to a certain task criterion. Possible specific scenarios within this framework include the learning of the routing of publication titles or news titles. In this paper we describe extensive experiments for semantic text routing based on classified library titles and newswire titles.

