Results 1 - 6 of 6
Supporting Polyrepresentation in a Quantum-inspired Geometrical Retrieval Framework
- In: IIiX 2010 (forthcoming)
Abstract - Cited by 8 (7 self)
The relevance of a document has many facets, going beyond the usual topical one, and these facets have to be considered to satisfy a user's information need. Multiple representations of documents, such as user-given reviews or the actual document content, can give evidence towards certain facets of relevance. In this respect polyrepresentation of documents, where such evidence is combined, is a crucial concept for estimating the relevance of a document. In this paper, we discuss how a geometrical retrieval framework inspired by quantum mechanics can be extended to support polyrepresentation. We show by example how different representations of a document can be modelled in a Hilbert space, similar to physical systems known from quantum mechanics. We further illustrate how these representations are combined by means of the tensor product to support polyrepresentation, and discuss the case where representations of documents are not independent from a user's point of view. Besides giving a principled framework for polyrepresentation, the potential of this approach is to capture and formalise the complex interdependent relationships that the different representations can have with each other.
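The tensor-product combination described in this abstract can be sketched numerically. The vectors, representation names, and probabilities below are illustrative assumptions, not values from the paper:

```python
import numpy as np

# Two representations of one document, each modelled as a unit vector in its
# own small Hilbert space (hypothetical example: "content" and "reviews",
# with amplitudes over the outcomes {relevant, not relevant}).
content = np.array([0.8, 0.6])
reviews = np.array([0.6, 0.8])

# Polyrepresentation combines the spaces via the tensor (Kronecker) product,
# yielding a joint state over all combinations of outcomes.
joint = np.kron(content, reviews)  # [0.48, 0.64, 0.36, 0.48]

# For independent representations, the probability of "relevant in both"
# factorises into the product of the individual probabilities.
p_both_relevant = joint[0] ** 2
assert np.isclose(p_both_relevant, (0.8 ** 2) * (0.6 ** 2))
```

The interesting case the abstract raises is when this factorisation fails, i.e. when the joint state is entangled and cannot be written as a single Kronecker product of per-representation vectors.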
Preliminary experiments using subjective logic for the polyrepresentation of information needs
- In: Proceedings IIiX 2012, ACM (2012) 174–183
Abstract - Cited by 2 (0 self)
According to the principle of polyrepresentation, retrieval accuracy may improve through the combination of multiple and diverse information object representations about, e.g., the context of the user, the information sought, or the retrieval system [9, 10]. Recently, the principle of polyrepresentation was mathematically expressed using subjective logic [12], where the potential suitability of each representation for improving retrieval performance was formalised through degrees of belief and uncertainty [15]. No experimental evidence or practical application has so far validated this model. We extend the work of Lioma et al. (2010) [15] by providing a practical application and analysis of the model. We show how to map the abstract notions of belief and uncertainty to real-life evidence drawn from a retrieval dataset. We also show how to estimate two different types of polyrepresentation assuming either (a) independence or (b) dependence between the information objects that are combined. We focus on the polyrepresentation of different types of context relating to user information needs (i.e. work task, user background knowledge, ideal answer) and show that the subjective logic model can predict their optimal combination prior to and independently of the retrieval process.
A subjective logic formalisation of the principle of ...
, 2010
Abstract
Interactive Information Retrieval refers to the branch of Information Retrieval that considers the retrieval process with respect to a wide range of contexts, which may affect the user’s information seeking experience. The identification and representation of such contexts has been the object of the principle of Polyrepresentation, a theoretical framework for reasoning about different representations arising from interactive information retrieval in a given context. Although the principle of Polyrepresentation has received attention from many researchers, not much empirical work has been done based on it. One reason may be that it has not yet been formalised mathematically. In this paper we propose an up-to-date and flexible mathematical formalisation of the principle of Polyrepresentation for information needs. Specifically, we apply Subjective Logic to model different representations of information needs as beliefs marked by degrees of uncertainty. We combine such beliefs using different logical operators, and we discuss these combinations with respect to different retrieval scenarios and situations. A formal model is introduced and discussed, with illustrative applications to the modelling of information needs.
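As a rough sketch of the kind of model this abstract describes, a binomial subjective-logic opinion can be represented as a (belief, disbelief, uncertainty) triple and combined with the standard cumulative (consensus) fusion operator. The specific opinions and representation names below are illustrative assumptions, not values from the paper:

```python
from dataclasses import dataclass

@dataclass
class Opinion:
    """A binomial subjective-logic opinion: belief + disbelief + uncertainty = 1."""
    belief: float
    disbelief: float
    uncertainty: float

def cumulative_fuse(a: Opinion, b: Opinion) -> Opinion:
    """Cumulative (consensus) fusion of two opinions from independent sources."""
    k = a.uncertainty + b.uncertainty - a.uncertainty * b.uncertainty
    return Opinion(
        belief=(a.belief * b.uncertainty + b.belief * a.uncertainty) / k,
        disbelief=(a.disbelief * b.uncertainty + b.disbelief * a.uncertainty) / k,
        uncertainty=(a.uncertainty * b.uncertainty) / k,
    )

# Two representations of the same information need (hypothetical example:
# the work task and the user's background knowledge), each expressed as an
# opinion about relevance.
task = Opinion(0.6, 0.1, 0.3)
background = Opinion(0.4, 0.2, 0.4)

fused = cumulative_fuse(task, background)
# Fusing two consistent, partly uncertain opinions reduces uncertainty
# relative to either source opinion.
assert fused.uncertainty < min(task.uncertainty, background.uncertainty)
```

Cumulative fusion assumes the two opinions come from independent sources; modelling dependent representations, as the related entries above discuss, calls for a different operator.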
Studies on Relevance, Ranking and Results Display
- WWW.JOURNALOFCOMPUTING.ORG
Abstract
This study considers the extent to which users with the same query agree as to what is relevant, and how what is considered relevant may translate into a retrieval algorithm and results display. To combine user perceptions of relevance with algorithm rank and to present results, we created a prototype digital library of scholarly literature. We confine studies to one population of scientists (paleontologists), one domain of scholarly scientific articles (paleo-related), and a prototype system (PaleoLit) that we built for the purpose. Based on the principle that users do not presuppose answers to a given query but that they will recognize what they want when they see it, our system uses a rules-based algorithm to cluster results into fuzzy categories with three relevance levels. Our system matches at least 1/3 of our participants’ relevancy ratings 87% of the time. Our subsequent usability study found that participants trusted our uncertainty labels but did not value our color-coded horizontal results layout above a standard retrieval list. We posit that users make such judgments in limited time, and that time optimization per task might help explain some of our findings. Index Terms: knowledge retrieval; uncertainty, “fuzzy,” and probabilistic reasoning; knowledge representation formalisms and methods
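A minimal sketch of assigning a result degrees of membership in three fuzzy relevance levels, assuming simple triangular membership functions over a normalised score. The thresholds and level names are illustrative assumptions, not the paper's actual rules:

```python
def fuzzy_relevance(score: float) -> dict:
    """Map a normalised retrieval score in [0, 1] to degrees of membership
    in three fuzzy relevance levels (triangular membership functions)."""
    high = max(0.0, min(1.0, (score - 0.5) / 0.5))   # ramps up over [0.5, 1.0]
    low = max(0.0, min(1.0, (0.5 - score) / 0.5))    # ramps up over [0.5, 0.0]
    mid = max(0.0, 1.0 - abs(score - 0.5) / 0.5)     # peaks at 0.5
    return {"high": high, "medium": mid, "low": low}

# A score of 0.75 is partly "high" and partly "medium", rather than being
# forced into a single category.
memberships = fuzzy_relevance(0.75)
```

Unlike a hard threshold, this keeps borderline results visible in two adjacent levels, which is the behaviour an uncertainty label can then communicate to the user.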
Evaluating Classifiers Without Expert Labels
- Machine Learning (manuscript)
Abstract
This paper considers the challenge of evaluating a set of classifiers, as done in shared task evaluations like the KDD Cup or NIST TREC, without expert labels. While expert labels provide the traditional cornerstone for evaluating statistical learners, limited or expensive access to experts represents a practical bottleneck. Instead, we seek methodology for estimating performance of the classifiers (relative and absolute) which is more scalable than expert labeling yet preserves high correlation with evaluation based on expert labels. We consider both: 1) using only labels automatically generated by the classifiers themselves (blind evaluation); and 2) using labels obtained via crowdsourcing. While crowdsourcing methods are lauded for scalability, using such data for evaluation raises serious concerns given the prevalence of label noise. In regard to blind evaluation, two broad strategies are investigated: combine & score and score & combine. Combine & score methods infer a single “pseudo-gold” label set by aggregating classifier labels; classifiers are then evaluated based on this single pseudo-gold label set. On the other hand, score & combine methods: i) sample multiple label sets from classifier outputs, ii) evaluate classifiers on each label set, and iii) average classifier performance across label sets. When additional crowd labels are also collected, we investigate two alternative avenues for exploiting them: 1) direct evaluation of classifiers; or 2) supervision of combine & score methods. To assess generality of our techniques, classifier performance is measured using four common classification metrics, with statistical significance tests establishing relative performance of the classifiers for each metric. Finally, we measure both score and rank correlations between estimated classifier performance vs. actual performance according to expert judgments. Rigorous evaluation of classifiers from the TREC 2011 Crowdsourcing Track shows that reliable evaluation can be achieved without reliance on expert labels.
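The combine & score strategy described in this abstract can be sketched as a majority vote over the classifiers' own outputs. The classifier names and label data below are illustrative assumptions, not data from the evaluation:

```python
from collections import Counter

# Hypothetical binary labels produced by three classifiers on five items.
classifier_labels = {
    "clf_a": [1, 0, 1, 1, 0],
    "clf_b": [1, 0, 0, 1, 0],
    "clf_c": [1, 1, 1, 1, 0],
}

def majority_vote(labels_per_clf: dict) -> list:
    """Aggregate per-item labels across classifiers into one pseudo-gold set."""
    per_item = zip(*labels_per_clf.values())
    return [Counter(votes).most_common(1)[0][0] for votes in per_item]

def accuracy(predicted: list, gold: list) -> float:
    return sum(p == g for p, g in zip(predicted, gold)) / len(gold)

# Combine: infer a single pseudo-gold label set by majority vote.
pseudo_gold = majority_vote(classifier_labels)  # [1, 0, 1, 1, 0]

# Score: evaluate every classifier against the pseudo-gold set.
scores = {name: accuracy(labels, pseudo_gold)
          for name, labels in classifier_labels.items()}
```

Score & combine would instead sample multiple plausible label sets from the classifier outputs, score each classifier on every sample, and average the results.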
Geometrical Retrieval Framework
Publication date: 2010. Document version: early version, also known as pre-print. Link to publication from Aalborg University.