Query Performance Analyser - a tool for bridging information retrieval research and instruction
, 2002
"... Information retrieval experiments usually measure the average effectiveness of IR methods developed. The analysis of individual queries is neglected although test results may contain individual test topics where general findings do not hold. The paper argues that, for the real user of an IR system, ..."
Abstract
Information retrieval experiments usually measure the average effectiveness of IR methods developed. The analysis of individual queries is neglected although test results may contain individual test topics where general findings do not hold. The paper argues that, for the real user of an IR system, the study of variation in results is even more important than averages. The Interactive Query Performance Analyser (QPA) for information retrieval systems is a tool for analysing and comparing the performance of individual queries. On top of a standard test collection, it gives an instant visualisation of the performance achieved in a given search topic by any user-generated query. In addition to experimental IR research, QPA can be used in user training to demonstrate the characteristics of and compare differences between IR systems and searching strategies. The experiences in applying the tool both in IR experiments and in IR instruction are reported. The need for bridging research and instruction is underlined.
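The point about per-query variation can be made concrete with a small, hypothetical calculation (this is not QPA code; the topics, rankings and relevance sets below are invented). Average precision is computed per topic, and the spread across topics is reported alongside the mean, since two methods with the same mean can behave very differently on individual topics.

```python
# Hypothetical illustration (not QPA code): why per-query analysis matters.
# Average precision is computed per topic from a ranked result list and a set
# of relevant document ids; the mean alone can hide large per-topic variation.
from statistics import mean, stdev

def average_precision(ranking, relevant):
    """Mean of the precision values at the ranks where relevant docs appear."""
    hits, precisions = 0, []
    for rank, doc_id in enumerate(ranking, start=1):
        if doc_id in relevant:
            hits += 1
            precisions.append(hits / rank)
    return sum(precisions) / len(relevant) if relevant else 0.0

# Invented per-topic results: (ranked list, relevant doc ids).
runs = {
    "topic-1": (["d3", "d7", "d1"], {"d3", "d1"}),
    "topic-2": (["d9", "d2", "d5"], {"d5"}),
    "topic-3": (["d4", "d8", "d6"], {"d4"}),
}
ap = {topic: average_precision(r, rel) for topic, (r, rel) in runs.items()}
print("per-topic AP:", ap)
print("mean AP: %.3f  stdev: %.3f" % (mean(ap.values()), stdev(ap.values())))
```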
Cumulated Gain-Based Indicators of IR Performance
, 2002
"... Modern large retrieval environments tend to overwhelm their users by their large output. Since all documents are not of equal relevance to their users, highly relevant documents should be identified and ranked first for presentation to the users. In order to develop IR techniques to this direction, ..."
Abstract
Modern large retrieval environments tend to overwhelm their users with their large output. Since all documents are not of equal relevance to their users, highly relevant documents should be identified and ranked first for presentation to the users. In order to develop IR techniques in this direction, it is necessary to develop evaluation approaches and methods that credit IR methods for their ability to retrieve highly relevant documents. This can be done by extending traditional evaluation methods, i.e., recall and precision based on binary relevance assessments, to graded relevance assessments. Alternatively, novel measures based on graded relevance assessments may be developed. This paper proposes three novel measures that compute the cumulative gain the user obtains by examining the retrieval result up to a given ranked position. The first one accumulates the relevance scores of retrieved documents along the ranked result list. The second one is similar but applies a discount factor to the relevance scores in order to devalue late-retrieved documents. The third one computes the relative-to-the-ideal performance of IR techniques, based on the cumulative gain they are able to yield. The novel measures are defined and discussed, and their use is then demonstrated in a case study.
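The three measures can be sketched in a few lines. The following minimal illustration computes cumulated gain, discounted cumulated gain and the normalized (relative-to-the-ideal) variant as described above; the log2 discount base and the invented list of graded relevance scores (0-3) are assumptions, not values taken from the paper's case study.

```python
# Sketch of the three cumulated-gain measures described above (assumptions:
# log2 discount, graded relevance scores 0-3, invented ranking).
import math

def cg(gains):
    """Cumulated gain: running sum of relevance scores down the ranked list."""
    out, total = [], 0
    for g in gains:
        total += g
        out.append(total)
    return out

def dcg(gains, base=2):
    """Discounted cumulated gain: ranks >= base contribute g / log_base(rank)."""
    out, total = [], 0.0
    for rank, g in enumerate(gains, start=1):
        total += g if rank < base else g / math.log(rank, base)
        out.append(total)
    return out

def ndcg(gains):
    """Normalized DCG: DCG divided by the DCG of the ideal (sorted) ranking."""
    ideal = dcg(sorted(gains, reverse=True))
    return [d / i if i else 0.0 for d, i in zip(dcg(gains), ideal)]

ranked_gains = [3, 2, 3, 0, 1, 2]   # relevance score of the document at each rank
print(cg(ranked_gains))
print([round(x, 2) for x in dcg(ranked_gains)])
print([round(x, 2) for x in ndcg(ranked_gains)])
```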
Cross-Lingual Information Retrieval Problems: Methods and findings for three language pairs
"... In this pa per we will disc ss dictiona ry-baWx cross-la nga ge informa ion retrieva l (CLIR) methods, a d report recent findings a nd problems. We will consider three la nga ge paq9q for CLIR: Finnish to English, English to Finnish, Swedish to English. We show tha t Finnish a nd Swedish ha ve spec ..."
Abstract
In this paper we will discuss dictionary-based cross-language information retrieval (CLIR) methods, and report recent findings and problems. We will consider three language pairs for CLIR: Finnish to English, English to Finnish, Swedish to English. We show that Finnish and Swedish have special features, e.g., the frequency of homographs and a high frequency of compound words, that affect retrieval effectiveness. Especially, correct word form normalization and compound splitting are essential. We report findings concerning the effectiveness of various query translation methods, query structures and linguistic tools used for CLIR. We also point out some problems and deficiencies in such tools. 1. Introduction There is an increasing amount of full text material in various languages available through the Internet and other information suppliers. Therefore cross-language information retrieval (CLIR) has become an important new research area (Oard & Dorr, 1996; Pirkola, 1999). It is a process of selecting and retrieving documents in a language different from the query language. One of the main approaches to CLIR is based on bilingual translation dictionaries. For an overview of the approaches, see (Hull & Grefenstette, 1996; Oard & Dorr, 1996; Pirkola, 1999). The main problems associated with dictionary-based CLIR are (1) phrase identification and translation, (2) source language ambiguity, (3) translation ambiguity, (4) the coverage of dictionaries, (5) the processing of inflected words, and (6) untranslatable keys, in particular proper names spelled differently in different languages. Translation ambiguity refers to the proportional increase of bad keys due to translation. Research has developed many effective methods to...
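As an illustration of the dictionary-based approach, the following hypothetical sketch translates a query word by word, keeps all translation alternatives of an ambiguous key together as one alternative set, and passes untranslatable keys (such as proper names) through unchanged. The toy dictionary and the placeholder normalizer are stand-ins, not the tools evaluated in the paper.

```python
# Hypothetical dictionary-based CLIR translation sketch (the bilingual
# dictionary and the normalizer below are toy stand-ins, not the paper's tools).
TOY_DICTIONARY = {            # source (Finnish) -> target (English) alternatives
    "tietokone": ["computer"],
    "haku": ["retrieval", "search", "fetch"],   # translation ambiguity
}

def normalize(word):
    """Placeholder for word-form normalization / compound splitting."""
    return word.lower()

def translate_query(source_words):
    """Group each key's translations as one alternative set; keys missing from
    the dictionary (e.g., proper names) are kept as-is."""
    structured = []
    for word in source_words:
        alternatives = TOY_DICTIONARY.get(normalize(word), [word])
        structured.append(alternatives)
    return structured

print(translate_query(["tietokone", "haku", "Nokia"]))
# [['computer'], ['retrieval', 'search', 'fetch'], ['Nokia']]
```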
Using Graded Relevance Assessments in IR Evaluation
, 2002
"... This paper proposes evaluation methods based on the use of non-dichotomous relevance judgements in IR experiments. It is argued that evaluation methods should credit IR methods for their ability to retrieve highly relevant documents. This is desirable from the user point of view in modern large IR e ..."
Abstract
This paper proposes evaluation methods based on the use of non-dichotomous relevance judgements in IR experiments. It is argued that evaluation methods should credit IR methods for their ability to retrieve highly relevant documents. This is desirable from the user point of view in modern large IR environments. The proposed methods are (1) a novel application of P-R curves and average precision computations based on separate recall bases for documents of different degrees of relevance, and (2) generalized recall and precision based directly on multiple-grade relevance assessments (i.e., not dichotomizing the assessments). We demonstrate the use of the traditional and the novel evaluation measures in a case study on the effectiveness of query types, based on combinations of query structures and expansion, in retrieving documents of various degrees of relevance. The test was run with a best match retrieval system (InQuery) in a text database consisting of newspaper articles. In order to gain insight into the retrieval process, one should use both graded relevance assessments and effectiveness measures that make it possible to observe the differences, if any, between retrieval methods in retrieving documents of different levels of relevance. In modern times of information overload, one should pay particular attention to the capability of retrieval methods to retrieve highly relevant documents.
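Generalized precision and recall, as characterized above, can be sketched directly: relevance scores are summed rather than dichotomized. The 0-1 score scale and the toy recall base below are assumptions for illustration only.

```python
# Sketch of generalized precision and recall on graded (non-dichotomized)
# relevance scores; the 0.0-1.0 scale and the toy data are assumptions.

def generalized_precision(retrieved, scores):
    """Sum of relevance scores of retrieved docs / number of retrieved docs."""
    return sum(scores.get(d, 0.0) for d in retrieved) / len(retrieved)

def generalized_recall(retrieved, scores):
    """Sum of relevance scores of retrieved docs / sum over the whole recall base."""
    return sum(scores.get(d, 0.0) for d in retrieved) / sum(scores.values())

graded = {"d1": 1.0, "d2": 0.5, "d3": 0.5, "d4": 0.25}   # recall base with grades
result = ["d1", "d5", "d3"]                               # one query's retrieved docs
print(generalized_precision(result, graded))   # (1.0 + 0 + 0.5) / 3 = 0.5
print(generalized_recall(result, graded))      # 1.5 / 2.25 = 0.666...
```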
A Test Collection for the Evaluation of Content-Based Image Retrieval
, 2001
"... Content-based image retrieval (CBIR) algorithms have been seen as a promising access method for digital photograph collections. Unfortunately, we have very little evidence of the usefulness of these algorithms in real user needs and contexts. In this paper, we introduce a test collection for the ..."
Abstract
Content-based image retrieval (CBIR) algorithms have been seen as a promising access method for digital photograph collections. Unfortunately, we have very little evidence of the usefulness of these algorithms for real user needs and contexts. In this paper, we introduce a test collection for the evaluation of CBIR algorithms. In the test collection, the performance testing is based on photograph similarity as perceived by end-users in the context of realistic illustration tasks and environments. The building process and the characteristics of the resulting test collection are outlined, including a typology of the similarity criteria expressed by the subjects judging the similarity of photographs. A small-scale study on the consistency of the similarity assessments is presented. A case evaluation of two CBIR algorithms is reported. The results show a clear correlation between the subjects' similarity assessments and the functioning of the feature parameters of the tested algorithms.
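One way such a correlation could be checked, sketched here purely for illustration (this is not the paper's actual analysis), is a rank correlation between the subjects' similarity judgments and an algorithm's similarity scores over the same image pairs.

```python
# Hypothetical sketch (not the paper's actual analysis): rank-correlate human
# similarity judgments with an algorithm's similarity scores for the same pairs.
from scipy.stats import spearmanr

human_scores     = [5, 4, 4, 2, 1]            # invented user-perceived similarity per image pair
algorithm_scores = [0.9, 0.7, 0.8, 0.3, 0.1]  # invented scores from one CBIR feature setting

rho, p_value = spearmanr(human_scores, algorithm_scores)
print(f"Spearman rho = {rho:.3f} (p = {p_value:.3f})")
```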
Formal Definition of Concept-Based Query Expansion and Construction
, 2001
"... Abstract: We develop a deductive data model for concept-based query expansion. It is based on three abstraction levels: the conceptual, linguistic and string levels. Concepts and relationships among them are represented at the conceptual level. The linguistic level gives natural language expressions ..."
Abstract
We develop a deductive data model for concept-based query expansion. It is based on three abstraction levels: the conceptual, linguistic and string levels. Concepts and relationships among them are represented at the conceptual level. The linguistic level gives natural language expressions for concepts. Each expression has one or more matching patterns at the string level. These patterns specify how the expression is matched in database indices built in varying ways. The data model supports a declarative concept-based query expansion and formulation tool, the ExpansionTool, for heterogeneous IR system environments. Conceptual expansion is implemented by a novel intelligent operator for traversing transitive relationships in cyclic concept networks. The number of expansion links followed, their types, and their weights can be used to control the expansion. A sample empirical experiment illustrating the use of the ExpansionTool in IR experiments is presented.
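A hypothetical sketch of the expansion idea follows; the concept network, relation types, and thresholds are illustrative stand-ins and not the ExpansionTool's actual data model or API. Expansion starts from a seed concept, follows typed, weighted relationships transitively, and is controlled by the maximum number of links followed and a minimum accumulated path weight.

```python
# Hypothetical sketch of concept-based expansion (illustrative only; not the
# ExpansionTool). A concept network with typed, weighted edges is traversed
# transitively from a seed concept, bounded by the number of links followed
# and a minimum accumulated path weight.
CONCEPT_NET = {  # concept -> list of (related concept, relation type, weight)
    "vehicle": [("car", "NT", 0.9), ("transport", "BT", 0.8)],
    "car":     [("automobile", "SYN", 1.0), ("engine", "RT", 0.5)],
    "engine":  [("piston", "RT", 0.4)],
}

def expand(seed, allowed_types=frozenset({"NT", "SYN", "RT"}),
           max_links=2, min_weight=0.4):
    """Return expansion concepts with their accumulated path weights."""
    found = {seed: 1.0}
    frontier = [(seed, 1.0, 0)]
    while frontier:
        concept, weight, depth = frontier.pop()
        if depth == max_links:
            continue
        for target, rel_type, w in CONCEPT_NET.get(concept, []):
            path_weight = weight * w
            if rel_type in allowed_types and path_weight >= min_weight and target not in found:
                found[target] = path_weight
                frontier.append((target, path_weight, depth + 1))
    return found

print(expand("vehicle"))
# {'vehicle': 1.0, 'car': 0.9, 'automobile': 0.9, 'engine': 0.45}
```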