@MISC{Rowlands_generalterms, author = {Tom Rowlands and David Hawking}, title = {General Terms}, year = {} }
Abstract
In real-world use of test collection methods, it is essential that the query test set be representative of the workload expected in the actual application. Using a random sample of queries from a media company’s query log as a ‘gold standard’ test set, we demonstrate that biases in sitemap-derived and top-n query sets can lead to significant perturbations in engine rankings and large differences in estimated performance levels.
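
As a rough illustration of why a top-n query set can differ from a random sample of the log, the following minimal sketch builds both from a toy log. The log contents and the helper names (`random_sample_queries`, `top_n_queries`, `workload_coverage`) are hypothetical and chosen only for illustration; they are not taken from the paper and do not reproduce its method or data.

```python
import random
from collections import Counter


def random_sample_queries(log, k, seed=0):
    """Sample k log entries uniformly without replacement.

    Repeated queries stay in the log, so each query is drawn roughly in
    proportion to its share of the workload, including rare tail queries.
    """
    rng = random.Random(seed)
    return rng.sample(log, k)


def top_n_queries(log, n):
    """Return the n most frequent distinct queries (the head of the log)."""
    return [query for query, _ in Counter(log).most_common(n)]


def workload_coverage(query_set, log):
    """Fraction of log entries whose query appears in query_set."""
    chosen = set(query_set)
    return sum(1 for query in log if query in chosen) / len(log)


# Hypothetical toy log: three head queries and ten one-off tail queries.
toy_log = (
    ["news"] * 50
    + ["weather"] * 30
    + ["sport"] * 10
    + [f"rare query {i}" for i in range(10)]
)

top2 = top_n_queries(toy_log, 2)
sample = random_sample_queries(toy_log, 20)

print("top-2 set:", top2)
print("share of log covered by top-2:", workload_coverage(top2, toy_log))
print("tail queries in top-2 set:", sum(q.startswith("rare") for q in top2))
print("tail queries in random sample:", sum(q.startswith("rare") for q in sample))
```

In this toy setting the top-n set contains only head queries by construction, whereas the random sample can include tail queries in proportion to their frequency. This is one way a top-n set may fail to represent the full workload.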