Query Expansion Using Local and Global Document Analysis
- In Proceedings of the 19th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval
, 1996
Cited by 600 (24 self)
Automatic query expansion has long been suggested as a technique for dealing with the fundamental issue of word mismatch in information retrieval. A number of approaches to expansion have been studied and, more recently, attention has focused on techniques that analyze the corpus to discover word relationships (global techniques) and those that analyze documents retrieved by the initial query (local feedback). In this paper, we compare the effectiveness of these approaches and show that, although global analysis has some advantages, local analysis is generally more effective. We also show that using global analysis techniques, such as word context and phrase structure, on the local set of documents produces results that are both more effective and more predictable than simple local feedback.
1 Introduction
The problem of word mismatch is fundamental to information retrieval. Simply stated, it means that people often use different words to describe concepts in their queries than auth...
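The local feedback idea described in this abstract can be illustrated with a minimal sketch. It is not the paper's method: the function name `expand_query`, the whitespace tokenization, and the raw-frequency term selection are all assumptions made for illustration.

```python
from collections import Counter

def expand_query(query_terms, ranked_docs, top_docs=2, num_terms=3):
    """Append the most frequent new terms found in the top-ranked documents.

    Hypothetical helper: real systems weight candidate terms far more
    carefully than the raw counts used here.
    """
    counts = Counter()
    for doc in ranked_docs[:top_docs]:
        counts.update(t for t in doc.lower().split() if t not in query_terms)
    # Sort by descending count, then alphabetically, so ties are deterministic.
    best = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))[:num_terms]
    return list(query_terms) + [term for term, _ in best]

# Documents in the order an initial retrieval run is assumed to have ranked them.
ranked = [
    "car engine repair manual",
    "automobile engine repair shop",
    "cooking pasta at home",
]
expanded = expand_query(["car"], ranked)
```

With these toy documents the query "car" picks up "automobile" from the second document, which is the kind of word mismatch the abstract refers to.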
Okapi at TREC-3
, 1996
Cited by 593 (5 self)
this document length correction factor is "global": it is added at the end, after the weights for the individual terms have been summed, and is independent of which terms match.
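A rough sketch of what a "global" correction means in scoring terms follows. The excerpt does not give the formula, so the form used in `length_correction` (a constant `k2` times query length times a normalized length difference) is an assumption, as are all function and parameter names.

```python
def length_correction(query_len, doc_len, avg_doc_len, k2=1.0):
    """Assumed form of the correction: depends only on lengths, not on matches."""
    return k2 * query_len * (avg_doc_len - doc_len) / (avg_doc_len + doc_len)

def score(term_weights, matched_terms, query_len, doc_len, avg_doc_len, k2=1.0):
    # First sum the weights of the individual matching terms...
    term_sum = sum(term_weights[t] for t in matched_terms)
    # ...then add the correction once, at the end.
    return term_sum + length_correction(query_len, doc_len, avg_doc_len, k2)

# Illustrative weights; the values are made up.
weights = {"retrieval": 1.5, "okapi": 2.0}
one_match = score(weights, ["retrieval"], query_len=2, doc_len=50, avg_doc_len=100)
two_match = score(weights, ["retrieval", "okapi"], query_len=2, doc_len=50, avg_doc_len=100)
```

Subtracting the term sums from `one_match` and `two_match` leaves the same correction in both cases, which is the sense in which the factor is independent of which terms match.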
TextTiling: Segmenting text into multi-paragraph subtopic passages
- Computational Linguistics
, 1997
Cited by 456 (2 self)
TextTiling is a technique for subdividing texts into multi-paragraph units that represent passages, or subtopics. The discourse cues for identifying major subtopic shifts are patterns of lexical co-occurrence and distribution. The algorithm is fully implemented and is shown to produce segmentation that corresponds well to human judgments of the subtopic boundaries of 12 texts. Multi-paragraph subtopic segmentation should be useful for many text analysis tasks, including information retrieval and summarization.
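The use of lexical co-occurrence as a cue for subtopic shifts can be sketched as follows. This is a heavily simplified illustration, not the published algorithm: it compares adjacent sentences rather than token blocks, skips smoothing and depth scoring, and the function names are hypothetical.

```python
import math
from collections import Counter

def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def gap_similarities(sentences):
    """Similarity across each gap between neighbouring sentences."""
    bags = [Counter(s.lower().split()) for s in sentences]
    return [cosine(bags[i], bags[i + 1]) for i in range(len(bags) - 1)]

def weakest_gap(sentences):
    """Index of the gap with the lowest lexical overlap (a candidate boundary)."""
    sims = gap_similarities(sentences)
    return min(range(len(sims)), key=sims.__getitem__)

text = [
    "the cat sat on the mat",
    "the cat chased the mouse",
    "stock prices fell sharply",
    "stock markets fell again",
]
boundary = weakest_gap(text)  # gap i lies between sentence i and sentence i + 1
```

In this toy input the second and third sentences share no words, so the candidate boundary falls between them.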
Improving the Effectiveness of Information Retrieval with Local Context Analysis
- ACM Transactions on Information Systems
, 2000
Cited by 197 (5 self)
Techniques for automatic query expansion have been extensively studied in information retrieval research as a means of addressing the word mismatch between queries and documents. These techniques can be categorized as either global or local. While global techniques rely on analysis of a whole collection to discover word relationships, local techniques emphasize analysis of the top-ranked documents retrieved for a query. Both types of techniques have advantages and limitations. In this paper we propose a new technique, called local context analysis, which combines the advantages of a global technique called Phrasefinder and a local technique known as local feedback. Experiments on a number of collections, both English and non-English, show that local context analysis offers more effective and consistent retrieval results.
A Critique and Improvement of an Evaluation Metric for Text Segmentation
- Computational Linguistics
, 2002
Advantages of query biased summaries in information retrieval
- In Proceedings of ACM SIGIR
, 1998
Cited by 150 (7 self)
www.dcs.gla.ac.uk/-tombrosa / www-ciir.cs.umass.edu/-sanderso/
This paper presents an investigation into the utility of document summarisation in the context of information retrieval, more specifically in the application of so-called query biased (or user directed) summaries: summaries customised to reflect the information need expressed in a query. Employed in the retrieved document list displayed after a retrieval took place, the summaries’ utility was evaluated in a task-based environment by measuring users’ speed and accuracy in identifying relevant documents. This was compared to the performance achieved when users were presented with the more typical output of an IR system: a static predefined summary composed of the title and first few sentences of retrieved documents. The results from the evaluation indicate that the use of query biased summaries significantly improves both the accuracy and speed of user relevance judgements.
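The contrast between a query biased summary and a static lead summary can be shown with a small sketch. The excerpt does not describe how the paper scores sentences, so the word-overlap scoring here and the name `query_biased_summary` are assumptions.

```python
def query_biased_summary(sentences, query, max_sentences=2):
    """Pick the sentences sharing the most words with the query.

    Hypothetical scoring by plain word overlap; selected sentences are
    returned in their original document order.
    """
    query_terms = set(query.lower().split())
    scored = [
        (len(query_terms & set(s.lower().split())), i)
        for i, s in enumerate(sentences)
    ]
    # Highest overlap first; earlier sentences win ties.
    top = sorted(scored, key=lambda p: (-p[0], p[1]))[:max_sentences]
    return [sentences[i] for _, i in sorted(top, key=lambda p: p[1])]

doc = [
    "Solar panels convert sunlight into electricity",
    "The company was founded in 1990",
    "Panel efficiency depends on solar cell material",
    "Its headquarters are in Glasgow",
]
static_summary = doc[:2]  # the "first few sentences" baseline
biased_summary = query_biased_summary(doc, "solar panel efficiency")
```

For this query the biased summary swaps the founding-date sentence for the one about panel efficiency, while the static summary is the same whatever the query.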
Quantitative evaluation of passage retrieval algorithms for question answering
- In Proceedings of the 26th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR
, 2003
Cited by 123 (7 self)
Passage retrieval is an important component common to many question answering systems. Because most evaluations of question answering systems focus on end-to-end performance, comparison of common components becomes difficult. To address this shortcoming, we present a quantitative evaluation of various passage retrieval algorithms for question answering, implemented in a framework called Pauchok. We present three important findings: Boolean querying schemes perform well in the question answering task. The performance differences between various passage retrieval algorithms vary with the choice of document retriever, which suggests significant interactions between document retrieval and passage retrieval. The best algorithms in our evaluation employ density-based measures for scoring query terms. Our results reveal future directions for passage retrieval and question answering.
Combining Approaches to Information Retrieval
Cited by 115 (3 self)
The combination of different text representations and search strategies has become a standard technique for improving the effectiveness of information retrieval. Combination, for example, has been studied extensively in the TREC evaluations and is the basis of the “meta-search” engines used on the Web. This paper examines the development of this technique, including both experimental results and the retrieval models that have been proposed as formal frameworks for combination. We show that combining approaches for information retrieval can be modeled as combining the outputs of multiple classifiers based on one or more representations, and that this simple model can provide explanations for many of the experimental results. We also show that this view of combination is very similar to the inference net model, and that a new approach to retrieval based on language models supports combination and can be integrated with the inference net model.
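As a concrete picture of combining the outputs of several retrieval runs, the sketch below sums min-max normalized scores per document. This particular scheme is chosen only for illustration and is not claimed to be the model the paper proposes; `normalize` and `combine` are hypothetical names.

```python
def normalize(scores):
    """Min-max normalize one run's scores into the range [0, 1]."""
    lo, hi = min(scores.values()), max(scores.values())
    span = hi - lo
    return {d: (s - lo) / span if span else 0.0 for d, s in scores.items()}

def combine(*runs):
    """Sum normalized scores across runs and return documents best first."""
    combined = {}
    for run in runs:
        for doc, s in normalize(run).items():
            combined[doc] = combined.get(doc, 0.0) + s
    # Ties are broken by document id so the ranking is deterministic.
    return sorted(combined, key=lambda d: (-combined[d], d))

# Two runs with different score scales; the values are made up.
run_a = {"d1": 10.0, "d2": 8.0, "d3": 2.0}
run_b = {"d2": 0.9, "d3": 0.5, "d4": 0.1}
ranking = combine(run_a, run_b)
```

Document d2 is ranked second by the first run and first by the second, and ends up at the top of the combined ranking because both runs support it.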
Automatic Text Decomposition Using Text Segments and Text Themes
Cited by 96 (3 self)
With the widespread use of full-text information retrieval, passage-retrieval techniques are becoming increasingly popular. Larger texts can then be replaced by important text excerpts, thereby simplifying the retrieval task and improving retrieval effectiveness. Passage-level evidence about the use of words in local contexts is also useful for resolving language ambiguities and improving retrieval output. Two main text decomposition strategies are introduced in this study, including a chronological decomposition into text segments, and semantic decomposition into text themes. The interaction between text segments and text themes is then used to characterize text structure, and to formulate specifications for information retrieval, text traversal, and text summarization.
KEYWORDS: Text structuring, text decomposition, segments, themes, information retrieval, passage retrieval, text summarization.
TEXT PASSAGES AND TEXT RELATIONSHIP MAPS
With the advent of full-text document processing...