W. S. Cooper. On selecting a measure of retrieval effectiveness, part I: The `subjective' philosophy of evaluation. Journal of the American Society for Information Science, 24(2):87--100, 1973.
....whole search process. Secondly, evaluation using a test collection relies on the relevance judgements, which are usually based solely on relatedness to the topic of the query. Su's study [115] found that precision was not correlated with the user's overall rating of the success of a query. Cooper [30] has argued that users do not just want relevant documents, they want documents with high utility, which he described as a catch-all concept involving not only topic relatedness but also quality, novelty, importance, credibility, and many other things. Subsequent user studies (summarised by ....
W. S. Cooper. On selecting a measure of retrieval effectiveness, part I: The `subjective' philosophy of evaluation. Journal of the American Society for Information Science, 24(2):87--100, 1973.
.... exhausting survey on relevance see [Miz96]. Classical IR research defines relevance mostly as topicality (e.g. the text retrieval conference competition [TREC, Har94]). Other researchers also take task-oriented utility (also called informativeness [Boy82]) into account [Boy82, Soe94, Har92, GL91, Coo73a, Coo73b, Coo78, Bar67]. Because of its subjective nature, task-oriented utility is harder to grasp and to express in a query. While topicality is often measured on a zero-to-one scale with zero and one denoting non-relevant and fully relevant, utility-oriented relevance may use different, e.g. ....
....with ranked outputs. Query ranks can have various types of impact on relevance. The relevance of lower-ranked objects, for example, may decrease by redundancy between objects or by saturation effects, e.g. in entertainment-related application areas (see, for example, [Boy82, Soe94, Har92, GL91, Coo73a, Coo73b, Coo78, Bar67]). 2. Knowledge about query ratings: If queries generate human-understandable ratings or if users have the chance to learn the rating scale by query or filtering output being labeled with query ratings, users may be able to associate query ratings with relevance. Users may, ....
W.S. Cooper. On selecting a measure of retrieval effectiveness, Part 1: The "subjective" philosophy of evaluation. Journal of the American Society for Information Science, 24(2):87-100, 1973.
....$t_i$ and $r_i$ for the non-linear case; obviously the linear case will follow by analogy. To show what is involved let me give an example of the estimation process using simple maximum likelihood estimates. The basis for our estimates is the following 2-by-2 table.

                  x_i = 1   x_i = 0
    x_j(i) = 1      [1]       [2]      [7]
    x_j(i) = 0      [3]       [4]      [8]
                    [5]       [6]      [9]

Here I have adopted a labelling scheme for the cells in which [x] means the number of occurrences in the cell labelled x. Ignoring for the moment the nature of the set on which this table is based, our estimates might be ....
....Here I have adopted a labelling scheme for the cells in which [x] means the number of occurrences in the cell labelled x. Ignoring for the moment the nature of the set on which this table is based, our estimates might be as follows: $P(x_i = 1 \mid x_{j(i)} = 1) = t_i \approx [1]/[7]$ and $P(x_i = 1 \mid x_{j(i)} = 0) = r_i \approx [3]/[8]$. In general we would have two tables of this kind when setting up our function $g(x)$, one for estimating the parameters associated with $P(x \mid w_1)$ and one for $P(x \mid w_2)$. In the limit we would have complete knowledge of which documents in the ....
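The estimation step quoted above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the cited author's code: it assumes that cell [1] counts documents with both $x_i = 1$ and $x_{j(i)} = 1$, cell [3] counts documents with $x_i = 1$ and $x_{j(i)} = 0$, and that [7] and [8] are the totals for $x_{j(i)} = 1$ and $x_{j(i)} = 0$. The function name `estimate_t_r` and the sample vectors are hypothetical.

```python
from typing import Sequence, Tuple


def estimate_t_r(x_i: Sequence[int], x_j: Sequence[int]) -> Tuple[float, float]:
    """Maximum likelihood estimates of t_i = P(x_i=1 | x_j(i)=1) and
    r_i = P(x_i=1 | x_j(i)=0) from two binary index-term vectors,
    one entry per document. Cell labels are an assumed reading of the table."""
    if len(x_i) != len(x_j):
        raise ValueError("x_i and x_j must cover the same documents")

    pairs = list(zip(x_i, x_j))
    cell_1 = sum(1 for a, b in pairs if a == 1 and b == 1)  # [1]: x_i=1, x_j(i)=1
    cell_3 = sum(1 for a, b in pairs if a == 1 and b == 0)  # [3]: x_i=1, x_j(i)=0
    cell_7 = sum(1 for _, b in pairs if b == 1)             # [7]: total with x_j(i)=1
    cell_8 = sum(1 for _, b in pairs if b == 0)             # [8]: total with x_j(i)=0

    if cell_7 == 0 or cell_8 == 0:
        raise ValueError("both values of x_j(i) must occur to form the estimates")

    return cell_1 / cell_7, cell_3 / cell_8


# Hypothetical occurrence data for six documents.
t_i, r_i = estimate_t_r([1, 1, 0, 1, 0, 0], [1, 1, 1, 0, 0, 0])
```

With plain relative frequencies an empty row makes the estimate undefined, which is why the sketch raises an error rather than guessing a value.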
COOPER, W.S., 'On selecting a measure of retrieval effectiveness', Part 1: 'The "subjective" philosophy of evaluation', Part 2: 'Implementation of the philosophy', Journal of the American Society for Information Science, 24, 87-100 and 413-424 (1973).
....variables). Finally, use the coefficients derived from the logistic regression on another set of data to see how accurately the equation can predict the relevance of a session. 9. Sampling Methodology. The data for this experiment consists of a smaller subset of the dataset described in Cooper (in press). This experiment uses all 905,970 sessions conducted with the Melvyl Web-based library catalog between February and October 1998. The methodology followed in this paper is one proposed by Breiman et al. (1984). His goal was to determine the predictive ability of a solution. Our goal here is similar: to see how well the ....
Cooper, W.S. (1973a). On selecting a measure of retrieval effectiveness, part 1: The "subjective" philosophy of evaluation. Journal of the American Society for Information Science, 24, 87--100.
No context found.
Cooper, W. S. (1973a). On selecting a measure of retrieval effectiveness, part 1: The "subjective" philosophy of evaluation, Journal of the American Society for Information Science 24(2): 87--100.