Results 1 - 9 of 9
Query polyrepresentation for ranking retrieval systems without relevance judgments
- Journal of the American Society for Information Science and Technology
Cited by 6 (0 self)
Ranking information retrieval (IR) systems with respect to their effectiveness is a crucial operation during IR evaluation, as well as during data fusion. This paper offers a novel method of approaching the system ranking problem, based on the widely studied idea of polyrepresentation. The principle of polyrepresentation suggests that a single information need can be represented by many query articulations, which we call query aspects. By skimming the top k (where k is small) documents retrieved by a single system for multiple query aspects, we collect a set of documents that are likely to be relevant to a given test topic. Labeling these skimmed documents as putatively relevant lets us build pseudo-relevance judgments without undue human intervention. We report experiments where using these pseudo-relevance judgments delivers a rank ordering of IR systems that correlates highly with rankings based on human relevance judgments.
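The skimming step in this abstract can be sketched as follows. This is a minimal illustration, assuming each query aspect yields a ranked list of document IDs; the function names, the choice of precision as the effectiveness measure, and the toy data are assumptions made here, not details from the paper.

```python
from typing import Dict, List, Set


def build_pseudo_qrels(aspect_runs: Dict[str, List[str]], k: int = 3) -> Set[str]:
    """Union the top-k documents retrieved for each query aspect."""
    pseudo_relevant: Set[str] = set()
    for ranked_docs in aspect_runs.values():
        pseudo_relevant.update(ranked_docs[:k])
    return pseudo_relevant


def precision_at(run: List[str], qrels: Set[str], n: int) -> float:
    """Fraction of the top-n slots filled by putatively relevant documents."""
    if n <= 0:
        return 0.0
    return sum(1 for doc in run[:n] if doc in qrels) / n


# Toy data (hypothetical): one system's results for three articulations
# of the same information need.
aspect_runs = {
    "aspect_a": ["d1", "d2", "d3", "d9"],
    "aspect_b": ["d2", "d4", "d1", "d8"],
    "aspect_c": ["d5", "d1", "d2", "d7"],
}
pseudo_qrels = build_pseudo_qrels(aspect_runs, k=2)

# Rank two hypothetical systems against the pseudo-judgments.
systems = {
    "sys_x": ["d1", "d4", "d7", "d2"],
    "sys_y": ["d7", "d8", "d1", "d9"],
}
ranking = sorted(
    systems,
    key=lambda name: precision_at(systems[name], pseudo_qrels, 4),
    reverse=True,
)
```

Keeping k small is what makes the union a plausible stand-in for relevance: only documents near the top of at least one aspect's ranking are labeled.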
Using multiple query aspects to build test collections without human relevance judgments
- In: Proceedings of the 31st European Conference on IR Research on Advances in Information Retrieval, 2009
Cited by 6 (0 self)
Abstract. Collecting relevance judgments (qrels) is an especially challenging part of building an information retrieval test collection. This paper presents a novel method for creating test collections by offering a substitute for relevance judgments. Our method is based on an old idea in IR: a single information need can be represented by many query articulations. We call different articulations of a particular need query aspects. By combining the top k documents retrieved by a single system for multiple query aspects, we build judgment-free qrels whose rank ordering of IR systems correlates highly with rankings based on human relevance judgments.
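One common way to quantify how closely two system orderings agree is Kendall's tau. The abstract does not say which correlation measure the authors used, so the sketch below is an assumption, and the effectiveness scores are made-up toy values rather than reported results.

```python
from itertools import combinations
from typing import Dict


def kendall_tau(scores_a: Dict[str, float], scores_b: Dict[str, float]) -> float:
    """Kendall's tau-a between two scorings of the same set of systems."""
    names = sorted(set(scores_a) & set(scores_b))
    concordant = discordant = 0
    for s, t in combinations(names, 2):
        # Positive product: both scorings order the pair the same way.
        direction = (scores_a[s] - scores_a[t]) * (scores_b[s] - scores_b[t])
        if direction > 0:
            concordant += 1
        elif direction < 0:
            discordant += 1
    pairs = len(names) * (len(names) - 1) // 2
    return (concordant - discordant) / pairs if pairs else 0.0


# Hypothetical per-system scores under human qrels and judgment-free qrels.
human = {"sys_a": 0.31, "sys_b": 0.27, "sys_c": 0.19, "sys_d": 0.12}
pseudo = {"sys_a": 0.55, "sys_b": 0.58, "sys_c": 0.40, "sys_d": 0.22}
tau = kendall_tau(human, pseudo)
```

A tau of 1 means the two orderings are identical and -1 means one is the reverse of the other; here one swapped pair out of six lowers the value from 1.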
Generative model-based metasearch for data fusion in information retrieval
- In Proc. of JCDL, 2009
Cited by 5 (0 self)
“Data fusion” refers to the problem in information retrieval (IR) where several lists of documents ranked against a query are to be merged into a single ranked list for presentation to a user. Data fusion is also known as “metasearch.” In a digital library setting data fusion may support operations such as federated search based on multiple repository representations. This paper presents a novel approach to the fusion problem: generative model-based Metasearch (GeM). We suggest viewing the appearance of documents in a return set as the outcome of a probabilistic process; some documents are likely to occur in the model, while others are unlikely. Using Bayesian parameter estimation to fit a multinomial distribution based on the return sets to be merged, GeM achieves a final ranking by listing documents in decreasing probability of generation under the induced model. We also introduce what we call “the impatient reader” approach to normalizing document ranks in service to the fusion operation. We report results from several experiments on TREC data suggesting that GeM, informed with impatient reader document scores, operates at state-of-the-art levels of effectiveness.
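The general idea of ranking by generation probability can be illustrated with a small sketch. It assumes a symmetric Dirichlet prior and uses 1/rank as a placeholder pseudo-count; the abstract does not define the "impatient reader" scores or the prior, so this is not GeM itself, only one plausible instance of fitting a multinomial to return sets.

```python
from collections import defaultdict
from typing import Dict, List


def fit_multinomial(ranked_lists: List[List[str]], alpha: float = 0.1) -> Dict[str, float]:
    """Posterior-mean multinomial over documents under a symmetric Dirichlet prior."""
    weights: Dict[str, float] = defaultdict(float)
    for ranked in ranked_lists:
        for rank, doc in enumerate(ranked, start=1):
            # ASSUMPTION: 1/rank is a stand-in pseudo-count, not the paper's
            # "impatient reader" score, which the abstract does not specify.
            weights[doc] += 1.0 / rank
    total = sum(weights.values()) + alpha * len(weights)
    return {doc: (w + alpha) / total for doc, w in weights.items()}


# Toy return sets (hypothetical) from three sources for the same query.
return_sets = [
    ["d1", "d2", "d3"],
    ["d2", "d1", "d4"],
    ["d2", "d5", "d1"],
]
model = fit_multinomial(return_sets)

# Final ranking: decreasing probability of generation, ties broken by ID.
fused = sorted(model, key=lambda doc: (-model[doc], doc))
```

With a symmetric prior the ordering matches the raw pseudo-counts; the prior matters once per-document prior mass differs or unseen documents must receive nonzero probability.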
Towards Quantum-based DB+IR Processing based on the Principle of Polyrepresentation
Cited by 4 (2 self)
Abstract. The cognitively motivated principle of polyrepresentation still lacks a theoretical foundation in IR. In this work, we discuss two competing polyrepresentation frameworks that are based on quantum theory. Both approaches support different aspects of polyrepresentation, where one is focused on the geometric properties of quantum theory while the other has a strong logical basis. We compare both approaches and outline how they can be combined to express further aspects of polyrepresentation.
Exploiting Information Needs and Bibliographics for Polyrepresentative Document Clustering
Cited by 1 (1 self)
Abstract. In this paper we explore the potential of combining the principle of polyrepresentation with document clustering. Our idea is discussed and evaluated for polyrepresentation of information needs as well as for document-based polyrepresentation where bibliographic information is used as representation. The main idea is to present the user with the highly ranked polyrepresentative clusters to support the search process. Our evaluation suggests that our approach is capable of increasing retrieval performance, but performance varies for queries with a high or low number of relevant documents.
Using Anchor Text for Homepage and Topic Distillation Search Tasks
- 2011
Past work suggests that anchor text is a good source of evidence that can be used to improve web searching. Two approaches for making use of this evidence include fusing search results from an anchor text representation and the original text representation based on a document’s relevance score or rank position, and combining term frequency from both representations during the retrieval process. Although these approaches have each been tested and compared against baselines, different evaluations have used different baselines; no consistent work enables rigorous cross-comparison between these methods. The purpose of this work is threefold. First, we survey existing fusion methods of using anchor text in search. Second, we compare these methods with common testbeds and web search tasks, with the aim of identifying the most effective fusion method. Third, we try to correlate search performance with the characteristics of a test collection. Our experimental results show that the best performing method in each category can significantly improve search results over a common baseline. However, there is no single technique that consistently outperforms competing approaches across different collections and search tasks.
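The difference between score-based and rank-based fusion of two representations can be shown with a small sketch. The specific methods used here (a weighted sum of min-max normalized scores, and reciprocal rank fusion with the conventional constant 60) are illustrative choices and not necessarily among those surveyed in the paper; the scores are toy values.

```python
from typing import Dict, List


def min_max(scores: Dict[str, float]) -> Dict[str, float]:
    """Rescale scores to [0, 1] so two representations are comparable."""
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        return {doc: 1.0 for doc in scores}
    return {doc: (s - lo) / (hi - lo) for doc, s in scores.items()}


def fuse_by_score(text: Dict[str, float], anchor: Dict[str, float],
                  anchor_weight: float = 0.6) -> List[str]:
    """Weighted sum of normalized scores; missing documents contribute 0."""
    t, a = min_max(text), min_max(anchor)
    fused = {
        doc: (1 - anchor_weight) * t.get(doc, 0.0) + anchor_weight * a.get(doc, 0.0)
        for doc in set(t) | set(a)
    }
    return sorted(fused, key=lambda doc: (-fused[doc], doc))


def fuse_by_rank(text: Dict[str, float], anchor: Dict[str, float],
                 k: int = 60) -> List[str]:
    """Reciprocal rank fusion: only rank positions matter, not score values."""
    def reciprocal(scores: Dict[str, float]) -> Dict[str, float]:
        order = sorted(scores, key=lambda doc: (-scores[doc], doc))
        return {doc: 1.0 / (k + rank) for rank, doc in enumerate(order, start=1)}

    t, a = reciprocal(text), reciprocal(anchor)
    fused = {doc: t.get(doc, 0.0) + a.get(doc, 0.0) for doc in set(t) | set(a)}
    return sorted(fused, key=lambda doc: (-fused[doc], doc))


# Hypothetical retrieval scores from the two representations.
text_scores = {"home": 3.0, "about": 4.0, "news": 5.0}
anchor_scores = {"home": 9.0, "about": 1.0}

by_score = fuse_by_score(text_scores, anchor_scores)
by_rank = fuse_by_rank(text_scores, anchor_scores)
```

On this toy input both methods put "home" first, but they order "about" and "news" differently, which illustrates why comparing fusion methods on a common testbed matters.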
Social Media Retrieval using Image Features and Structured Text
Abstract. Use of XML offers a structured approach for representing information while maintaining separation of form and content. XML information retrieval is different from standard text retrieval in two aspects: the XML structure may be of interest as part of the query; and the information does not have to be text. In this paper, we describe an investigation of approaches to retrieve text and images from a large collection of XML documents, performed in the course of our participation in the INEX 2006 Ad Hoc and Multimedia tracks. We evaluate three information retrieval similarity measures: Pivoted Cosine, Okapi BM25 and Dirichlet. We show that on the INEX 2006 Ad Hoc queries Okapi BM25 is the most effective among the three similarity measures used for retrieving text only, while Dirichlet is more suitable when retrieving heterogeneous (text and image) data. Keywords: Content-based image retrieval, text-based information retrieval, social media, linear combination of evidence
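Two of the three similarity measures named in this abstract can be sketched in their textbook forms. The parameter defaults (k1, b, mu), the IDF variant, and the toy corpus are assumptions made for illustration; the paper's actual settings are not given in the abstract, and Pivoted Cosine is omitted here.

```python
import math
from collections import Counter
from typing import List

# Toy corpus (hypothetical), already tokenized.
docs: List[List[str]] = [
    "xml retrieval of text and images".split(),
    "image features for social media".split(),
    "structured text retrieval with xml xml".split(),
]
avg_len = sum(len(d) for d in docs) / len(docs)
collection = Counter(term for d in docs for term in d)
collection_len = sum(collection.values())


def bm25(query: List[str], doc: List[str], k1: float = 1.2, b: float = 0.75) -> float:
    """Okapi BM25 with a non-negative IDF variant (assumed, not from the paper)."""
    tf = Counter(doc)
    score = 0.0
    for term in query:
        df = sum(1 for d in docs if term in d)
        idf = math.log((len(docs) - df + 0.5) / (df + 0.5) + 1.0)
        norm = tf[term] + k1 * (1 - b + b * len(doc) / avg_len)
        score += idf * tf[term] * (k1 + 1) / norm
    return score


def dirichlet(query: List[str], doc: List[str], mu: float = 2000.0) -> float:
    """Query log-likelihood under a Dirichlet-smoothed document language model."""
    tf = Counter(doc)
    score = 0.0
    for term in query:
        p_collection = collection[term] / collection_len
        if p_collection == 0.0:
            continue  # term unseen in the collection: skipped in this sketch
        score += math.log((tf[term] + mu * p_collection) / (len(doc) + mu))
    return score


query = ["xml", "retrieval"]
bm25_scores = [bm25(query, d) for d in docs]
dirichlet_scores = [dirichlet(query, d) for d in docs]
```

BM25 gives a document with no query terms a score of exactly zero, while the smoothed language model still assigns it a finite log-likelihood from the collection statistics.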
KOSO: A Reference-Ontology for Reuse of Existing Knowledge Organization Systems
Abstract. This paper introduces KOSO, an ontology which aims at structuring expert knowledge about different types of knowledge organization systems (KOS) and at providing an organized access to already existing knowledge models. The classification and detailed description of such KOS will be enabled by the ontology. Furthermore, existing cross-concordances and interrelations between different KOS will be captured. The basic structure of KOSO is presented and should encourage future discussions and refinements. Keywords: Reference-ontology, ontology engineering, knowledge organization systems, semantic upgrades, semantic interoperability.
BTU DBIS' Multimodal Wikipedia Retrieval Runs at ImageCLEF 2011
Abstract. In this work, we summarize the results of our first participation in the Wikipedia Retrieval task. For our experiments, we rely on a cognitively motivated IR model: the principle of polyrepresentation. The principle’s core hypothesis is that a document is defined by different representations, such as low-level features or textual content, that can be combined in a structured manner reflecting the user’s information need. For our first participation, we used monolingual English retrieval in combination with global low-level features without further user interaction or query modification techniques. Our best NOFB reached rank 64 or rank 13 of the monolingual English runs. This result is promising as we have not used structural information about the documents. Additionally, our findings indicate the correctness of the polyrepresentative hypothesis for multimodal retrieval.