7 citations found.
Z. Lu, J.P. Callan, and W.B. Croft. Measures in collection ranking evaluation. Technical Report 96-39, Department of Computer Science, University of Massachusetts, 1996.


This paper is cited in the following contexts:
Metrics for Evaluating Database Selection Techniques - French, Powell (1999)   (1 citation)

....merit contributed by all the databases that are useful to the query. Thus, $\hat{R}_n(E, B)$ is a measure of how much of the total merit has been accumulated via the top n databases in the estimated ranking. This measure has also been proposed by Lu et al. and was used to report results by French et al. [Lu et al. 1996; French et al. 1998; French et al. 1999b; French et al. 1999a]. These two measures are clearly related. Since $R_n(E, B) \sum_{i=1}^{n} B_i = \hat{R}_n(E, B) \sum_{i=1}^{n^*} B_i$ (7), we have $R_n(E, B) \ge \hat{R}_n(E, B)$ and $R_{n^*}(E, B) = \hat{R}_{n^*}(E, B)$. Note that in the remainder of the paper we will simplify ....

Lu, Z., J. P. Callan, and W. B. Croft (1996), "Measures in Collection Ranking Evaluation," Technical Report TR-96-39, Computer Science Department, University of Massachusetts.
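
As a rough illustration of the two recall-style measures discussed in the context above, the following Python sketch computes both from lists of merit values. The function names, the list representation, and the sample merits are assumptions made for this example, not taken from the cited papers. It also assumes that merits are nonnegative, that the baseline list is sorted by descending merit, and that R_n divides by the merit of the top n baseline databases.

```python
def r_n(est_merits, base_merits, n):
    """R_n: merit gathered by the top n of the estimated ranking, relative to
    the merit of the top n of the baseline ranking (assumed definition)."""
    denom = sum(base_merits[:n])
    return sum(est_merits[:n]) / denom if denom else 0.0


def r_hat_n(est_merits, base_merits, n):
    """R-hat_n: merit gathered by the top n of the estimated ranking, relative
    to the total merit of all databases that are useful to the query."""
    total = sum(base_merits)
    return sum(est_merits[:n]) / total if total else 0.0


# Hypothetical merits (for example, relevant-document counts) for five databases.
baseline = [9, 6, 3, 0, 0]   # best possible ordering: descending merit
estimated = [6, 0, 9, 3, 0]  # the same merits, in the order a system ranked them

for n in range(1, len(baseline) + 1):
    print(n, round(r_n(estimated, baseline, n), 3),
          round(r_hat_n(estimated, baseline, n), 3))
```

Under these assumptions R_n is never smaller than R-hat_n, and the two coincide once n covers every database with nonzero baseline merit.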


Metrics for Evaluating Database Selection Techniques - French, Powell (1999)   (1 citation)

....$B_i$ (4) The denominator is just the total merit contributed by all the databases that are useful to the query. Thus, $\hat{R}_n(E, B)$ is a measure of how much of the total merit has been accumulated via the top n databases in the estimated ranking. This measure has also been proposed by Lu et al. [10] and was used to report results by French et al. [4, 3, 2]. These two measures are clearly related. Since $R_n(E, B) \sum_{i=1}^{n} B_i = \hat{R}_n(E, B) \sum_{i=1}^{n^*} B_i$ (5), we have $R_n(E, B) \ge \hat{R}_n(E, B)$ and $R_{n^*}(E, B) = \hat{R}_{n^*}(E, B)$. Note that in the remainder of the paper we will ....

Z. Lu, J. P. Callan, and W. B. Croft. Measures in collection ranking evaluation. Technical Report TR-96-39, Computer Science Department, University of Massachusetts, 1996.


Query-Based Sampling of Text Databases - Callan, Connell (1999)   (23 citations)  Self-citation (Callan)

....desirable metric when the accuracy of the database ranking algorithm is to be measured independently of other system components, and when the goal is to rank databases containing many relevant documents ahead of databases containing few relevant documents. (Footnote 4: The metric called R was called R in [23]. We use the more recent and more widely known name, R, in this paper.) (Figure residue: curves labeled Complete; Learned, 700 docs; Learned, 300 docs; Learned, 100 docs. x-axis: Percentage of collections searched. Panel (a).) ....

Z. Lu, J.P. Callan, and W.B. Croft. Measures in collection ranking evaluation. Technical Report 96-39, Department of Computer Science, University of Massachusetts, 1996.


Distributed Information Retrieval - Callan (2000)   (15 citations)  Self-citation (Callan)

....${}' = \frac{R_i - R_{\min}}{R_{\max} - R_{\min}}$ (5.13) $D' = \frac{D - D_{\min_i}}{D_{\max_i} - D_{\min_i}}$ (5.14) $D'' = \frac{D' + 0.4 \cdot D' \cdot R'_i}{1.4}$ (5.15) In INQUERY, $D_{\max_i}$ for database $R_i$ is calculated by setting the tf component of the tf.idf algorithm to its maximum value (1.0) for each query term; $D_{\min_i}$ for database $R_i$ is calculated by setting the tf component of the tf.idf algorithm to its minimum value (0.0) for each query term. Hence $D_{\max_i}$ and $D_{\min_i}$ are estimates of the maximum and minimum scores any document in database $R_i$ could be assigned for the given ....

....$D' \cdot R'_i \,/\, 1.4$ (5.15) In INQUERY, $D_{\max_i}$ for database $R_i$ is calculated by setting the tf component of the tf.idf algorithm to its maximum value (1.0) for each query term; $D_{\min_i}$ for database $R_i$ is calculated by setting the tf component of the tf.idf algorithm to its minimum value (0.0) for each query term. Hence $D_{\max_i}$ and $D_{\min_i}$ are estimates of the maximum and minimum scores any document in database $R_i$ could be assigned for the given query. Equation 5.14 solves the problem of highly skewed idf scores, because it is effective on testbeds with and without highly skewed idf ....

Lu, Z., Callan, J., and Croft, W. (1996b). Measures in collection ranking evaluation. Technical Report 96-39, Department of Computer Science, University of Massachusetts.
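
The normalization steps numbered 5.13 to 5.15 in the contexts above can be sketched in Python as follows. This is a reading of the excerpt rather than a reference implementation: the function and parameter names are hypothetical, the sample scores are invented, and the sketch assumes each maximum is strictly greater than its minimum so that no division by zero occurs.

```python
def normalize_db_score(r_i, r_min, r_max):
    """Equation 5.13: min-max normalize a database score."""
    return (r_i - r_min) / (r_max - r_min)


def normalize_doc_score(d, d_min_i, d_max_i):
    """Equation 5.14: min-max normalize a document score using the minimum and
    maximum scores any document in database i could receive for the query."""
    return (d - d_min_i) / (d_max_i - d_min_i)


def merged_score(d_norm, r_norm):
    """Equation 5.15: weight the normalized document score by the normalized
    score of the database it came from."""
    return (d_norm + 0.4 * d_norm * r_norm) / 1.4


# Hypothetical scores for one document from one database.
r_norm = normalize_db_score(r_i=0.5, r_min=0.25, r_max=0.75)
d_norm = normalize_doc_score(d=0.625, d_min_i=0.25, d_max_i=0.75)
print(r_norm, d_norm, merged_score(d_norm, r_norm))
```

Dividing by 1.4 keeps the merged score in the range 0 to 1 whenever both normalized inputs are in that range.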


Comparing the Performance of Database Selection.. - French, Powell, Callan, .. (1999)   (28 citations)  Self-citation (Callan)

....is the breakpoint between the useful and useless databases. The denominator is just the total merit contributed by all the databases that are useful to the query. Thus, $\hat{R}_n$ is a measure of how much of the total merit has been accumulated via the top n databases in the estimated ranking. Lu et al. [17] have also suggested using this measure. Gravano et al. [13] have also proposed a precision-related measure, $P_n$. It is defined as follows. $P_n = \frac{|\{db \in Top_n(E) \mid merit(q, db) > 0\}|}{|Top_n(E)|}$ (11) This gives the fraction of the top n databases in the estimated ranking that have nonzero merit. ....

....000) databases. Early research on database selection was based on testbeds of O(10) databases [7, 15] and suggested that selection algorithms were very accurate. However, as the algorithms were improved and applied to testbeds of O(100) databases, the experimental results became more ambiguous [13, 17, 24]. A high priority for our recent research has been to determine whether the CORI algorithm remains effective as the number of databases increases. 5.1 Testbeds Experiments were conducted with the 236 collection testbed used throughout this paper, as well as two additional testbeds of 100 ....

Z. Lu, J. P. Callan, and W. B. Croft. Measures in collection ranking evaluation. Technical Report TR-96-39, Computer Science Department, University of Massachusetts, 1996.
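
The precision-related measure P_n described in the first context above (the fraction of the top n databases in the estimated ranking that have nonzero merit) can be illustrated with a short Python sketch. The function name, the database identifiers, and the merit values are hypothetical, and the sketch assumes that merit is a nonnegative number per database for a fixed query.

```python
def p_n(ranked_dbs, merit, n):
    """Fraction of the top n databases in an estimated ranking that have
    nonzero merit for the query; returns 0.0 if the top-n slice is empty."""
    top_n = ranked_dbs[:n]
    if not top_n:
        return 0.0
    useful = sum(1 for db in top_n if merit.get(db, 0) > 0)
    return useful / len(top_n)


# Hypothetical estimated ranking and per-database merit for one query.
ranking = ["db_c", "db_a", "db_e", "db_b"]
merit = {"db_a": 6, "db_b": 0, "db_c": 9, "db_e": 0}

for n in range(1, len(ranking) + 1):
    print(n, p_n(ranking, merit, n))
```

Dividing by the size of the slice rather than by n keeps the value well defined when fewer than n databases were ranked.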


Comparing the Performance of Database Selection.. - French, Powell, Callan, .. (1999)   (28 citations)  Self-citation (Callan)

....is the breakpoint between the useful and useless databases. The denominator is just the total merit contributed by all the databases that are useful to the query. Thus, $\hat{R}_n$ is a measure of how much of the total merit has been accumulated via the top n databases in the estimated ranking. Lu et al. [17] have also suggested using this measure. Gravano et al. [13] have also proposed a precision-related measure, $P_n$. It is defined as follows. $P_n = \frac{|\{db \in Top_n(E) \mid merit(q, db) > 0\}|}{|Top_n(E)|}$ (8) This gives the fraction of the top n databases in the estimated ranking that have nonzero merit. ....

....000) databases. Early research on database selection was based on testbeds of O(10) databases [7, 15] and suggested that selection algorithms were very accurate. However, as the algorithms were improved and applied to testbeds of O(100) databases, the experimental results became more ambiguous [13, 17, 24]. A high priority for our recent research has been to determine whether the CORI algorithm remains effective as the number of databases increases. 5.1 Testbeds Experiments were conducted with the 236 collection testbed used throughout this paper, as well as two additional testbeds of 100 ....

Z. Lu, J. P. Callan, and W. B. Croft. Measures in collection ranking evaluation. Technical Report TR-96-39, Computer Science Department, University of Massachusetts, 1996.


Effective and Efficient Automatic Database Selection - French, Powell, Callan (1999)   (2 citations)  Self-citation (Callan)

No context found.

Z. Lu, J. P. Callan, and W. B. Croft. Measures in collection ranking evaluation. Technical Report TR-96-39, Computer Science Department, University of Massachusetts, 1996.


CiteSeer.IST - Copyright Penn State and NEC