Extensions to the STAIRS Study – Empirical Evidence for the Hypothesised Ineffectiveness of Boolean Queries in Large Full-Text Databases (2001)

by E. Sormunen
Venue: Information Retrieval

Results 1 - 3 of 3

Cumulated Gain-based Evaluation of IR Techniques

by Kalervo Järvelin, Jaana Kekäläinen - ACM Transactions on Information Systems , 2002
Cited by 694 (3 self)
Modern large retrieval environments tend to overwhelm their users with the volume of their output. Since not all documents are equally relevant to their users, highly relevant documents should be identified and ranked first for presentation. To develop IR techniques in this direction, it is necessary to develop evaluation approaches and methods that credit IR methods for their ability to retrieve highly relevant documents. This can be done by extending traditional evaluation methods, i.e., recall and precision based on binary relevance assessments, to graded relevance assessments. Alternatively, novel measures based on graded relevance assessments may be developed. This paper proposes three novel measures that compute the cumulative gain the user obtains by examining the retrieval result up to a given ranked position. The first one accumulates the relevance scores of retrieved documents along the ranked result list. The second one is similar but applies a discount factor to the relevance scores in order to devalue late-retrieved documents. The third one computes the relative-to-the-ideal performance of IR techniques, based on the cumulative gain they are able to yield. The novel measures are defined and discussed, and their use is then demonstrated in a case study using TREC data: sample system run results for 20 queries in TREC-7. As the relevance base we used novel graded relevance assessments on a four-point scale. The test results indicate that the proposed measures credit IR methods for their ability to retrieve highly relevant documents and allow testing of the statistical significance of effectiveness differences. The graphs based on the measures also provide insight into the performance of IR techniques and allow interpretation, e.g., from the user point of ...
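
The three measures described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation, and several details are assumptions because the abstract does not state them: the discount is taken to be a logarithm of the rank with base 2 (applied only from rank 2 onward), the ideal ranking is taken to be the judged grades sorted in descending order, and the function names and sample grades (on a 0 to 3 scale) are hypothetical.

```python
import math


def cumulated_gain(gains):
    """Running sum of relevance scores along the ranked result list."""
    total, out = 0.0, []
    for g in gains:
        total += g
        out.append(total)
    return out


def discounted_cumulated_gain(gains, base=2):
    """Like cumulated_gain, but devalues documents retrieved late.

    Assumption: ranks below `base` are not discounted; from rank `base`
    onward each score is divided by log_base(rank).
    """
    total, out = 0.0, []
    for rank, g in enumerate(gains, start=1):
        discount = 1.0 if rank < base else math.log(rank, base)
        total += g / discount
        out.append(total)
    return out


def normalized(actual, ideal):
    """Position-wise ratio of an actual gain vector to the ideal one."""
    return [a / i if i > 0 else 0.0 for a, i in zip(actual, ideal)]


# Hypothetical graded judgments (0 = not relevant ... 3 = highly relevant)
# for the first six documents of one ranked result list.
run_gains = [3, 2, 3, 0, 1, 2]

# Assumption: the ideal ordering is the same judged grades, best first.
ideal_gains = sorted(run_gains, reverse=True)

cg = cumulated_gain(run_gains)
dcg = discounted_cumulated_gain(run_gains)
ncg = normalized(cg, cumulated_gain(ideal_gains))
ndcg = normalized(dcg, discounted_cumulated_gain(ideal_gains))
```

In this sketch the ideal vector is built only from the documents in the run; an evaluation against a full recall base would instead sort the grades of all judged documents for the query, which generally lowers the normalized values.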

A retrospective evaluation method for exact-match and best-match queries applying an interactive query performance analyser

by Eero Sormunen - in Crestani, F. et al. (Eds), Advances in Information Retrieval: Proceedings of the 24th European Colloquium on IR Research , 2002
Cited by 5 (2 self)
Abstract. A retrospective method for the performance comparison of queries based on different IR models is introduced. The method is based on the interactive optimisation of queries by a group of test searchers using a query performance analyser. The case experiment focused on comparing the maximum effectiveness of Boolean exact-match queries, and structured and unstructured best-match queries. The experiment verified the problems in maintaining precision of Boolean queries at high recall levels. Interesting similarities were also observed between structured and unstructured best-match queries challenging the results of earlier studies.
Citation Context

...iptions in natural language tends to be high (i.e., query exhaustivity was high). In queries without expansion, precision collapses already at low recall levels because of the exact-match requirement [2, 20]. Similar drop is not likely to happen in best-match queries. In the latter study [12], Boolean queries contained few conjunctions (i.e., query exhaustivity was low) to guarantee high recall. Best-mat...

Bibliographic database access using free-text and controlled vocabulary: An evaluation

by unknown authors