Proximity-based document representation for named entity retrieval
- In Proceedings of the Sixteenth ACM Conference on Information and Knowledge Management
, 2007
Cited by 57 (0 self)
One aspect in which retrieving named entities is different from retrieving documents is that the items to be retrieved – persons, locations, organizations – are only indirectly described by documents throughout the collection. Much work has been dedicated to finding references to named entities, in particular to the problems of named entity extraction and disambiguation. However, just as important for retrieval performance is how these snippets of text are combined to build named entity representations. We focus on the TREC expert search task where the goal is to identify people who are knowledgeable on a specific topic. Existing language modeling techniques for expert finding assume that terms and person entities are conditionally independent given a document. We present theoretical and experimental evidence that this simplifying assumption ignores information on how named entities relate to document content. To address this issue, we propose a new document representation which emphasizes text in proximity to entities and thus incorporates sequential information implicit in text. Our experiments demonstrate that the proposed model significantly improves retrieval performance. The main contribution of this work is an effective formal method for explicitly modeling the dependency between the named entities and terms which appear in a document.
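The idea of emphasizing text near entity mentions can be illustrated with a minimal sketch. The abstract does not state the weighting function, so the Gaussian kernel, the `sigma` parameter, and the function name `proximity_term_weights` below are assumptions chosen for illustration, not the authors' model.

```python
import math
from collections import defaultdict

def proximity_term_weights(tokens, entity_positions, sigma=2.0):
    """Build an entity-centric term distribution for one document.

    Each non-entity token is weighted by a Gaussian kernel of its distance
    to the nearest entity mention (kernel choice is an assumption).
    """
    if not entity_positions:
        return {}
    weights = defaultdict(float)
    for i, tok in enumerate(tokens):
        if i in entity_positions:
            continue  # skip the entity mention itself
        d = min(abs(i - p) for p in entity_positions)
        weights[tok] += math.exp(-(d * d) / (2 * sigma * sigma))
    total = sum(weights.values())
    return {t: w / total for t, w in weights.items()} if total else {}

tokens = ["alice", "studies", "retrieval", "and", "likes", "tea"]
weights = proximity_term_weights(tokens, {0})  # "alice" is the entity mention
```

Terms close to the mention ("studies") receive more probability mass than distant ones ("tea"), which is the dependency between entities and terms that a plain bag-of-words document model ignores.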
Formal models for expert finding on dblp bibliography data
- In ICDM
, 2008
Cited by 40 (4 self)
Finding relevant experts in a specific field is often crucial for consulting, both in industry and in academia. The aim of this paper is to address the expert-finding task in a real-world academic field. We present three models for expert finding based on the large-scale DBLP bibliography, supplemented with data from Google Scholar. The first, a novel weighted language model, models an expert candidate based on the relevance and importance of associated documents by introducing a document prior probability, and achieves much better results than the basic language model. The second, a topic-based model, represents each candidate as a weighted sum of multiple topics, whilst the third, a hybrid model, combines the language model and the topic-based model. We evaluate our system using a benchmark dataset based on human relevance judgments of how well the expertise of proposed experts matches a query topic. Evaluation results show that our hybrid model outperforms other models in nearly all metrics.
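A document prior can be folded into a language-model score roughly as follows. This is a sketch under stated assumptions: the Jelinek-Mercer smoothing, the parameter `lam`, and the way priors are supplied are illustrative choices, since the abstract does not specify how the prior is estimated.

```python
from collections import Counter

def query_likelihood(query, doc_tokens, collection_tf, collection_len, lam=0.5):
    """Jelinek-Mercer smoothed P(q | d) for a tokenized query."""
    tf = Counter(doc_tokens)
    p = 1.0
    for term in query:
        p_doc = tf[term] / len(doc_tokens) if doc_tokens else 0.0
        p_col = collection_tf.get(term, 0) / collection_len
        p *= lam * p_doc + (1 - lam) * p_col
    return p

def score_candidate(query, docs, priors, collection_tf, collection_len):
    """Sum over a candidate's documents of prior(d) * P(q | d)."""
    return sum(
        priors[d] * query_likelihood(query, toks, collection_tf, collection_len)
        for d, toks in docs.items()
    )

# Toy data: one candidate with two associated documents.
docs = {"d1": ["expert", "finding"], "d2": ["graph", "search"]}
collection_tf = Counter(t for toks in docs.values() for t in toks)
uniform = score_candidate(["expert"], docs, {"d1": 0.5, "d2": 0.5}, collection_tf, 4)
weighted = score_candidate(["expert"], docs, {"d1": 0.9, "d2": 0.1}, collection_tf, 4)
```

With a uniform prior every document counts equally; shifting prior mass toward the on-topic document `d1` raises the candidate's score, which is the effect a document-importance prior is meant to have.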
Modeling multi-step relevance propagation for expert finding
- In CIKM ’08
, 2008
Cited by 32 (3 self)
An expert finding system allows a user to type a simple text query and retrieve names and contact information of individuals that possess the expertise expressed in the query. This paper proposes a novel approach to expert finding in large enterprises or intranets by modeling candidate experts (persons), web documents and various relations among them with so-called expertise graphs. As distinct from the state-of-the-art approaches estimating personal expertise through one-step propagation of relevance probability from documents to the related candidates, our methods are based on the principle of multi-step relevance propagation in topic-specific expertise graphs. We model the process of expert finding by probabilistic random walks of three kinds: finite, infinite and absorbing. Experiments on TREC Enterprise Track data originating from two large organizations show that our methods using multi-step relevance propagation improve over the baseline one-step propagation based method in almost all cases.
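The difference between one-step and multi-step propagation can be seen in a small finite random walk. The graph, the transition probabilities, and the function name `k_step_walk` are hypothetical; the sketch covers only the finite walk, not the infinite or absorbing variants.

```python
def k_step_walk(transition, start, steps):
    """Propagate the probability distribution `start` for `steps` steps.

    `transition[node]` maps each neighbor to a transition probability;
    the probabilities leaving a node are assumed to sum to 1.
    """
    dist = dict(start)
    for _ in range(steps):
        nxt = {}
        for node, p in dist.items():
            for nbr, w in transition.get(node, {}).items():
                nxt[nbr] = nxt.get(nbr, 0.0) + p * w
        dist = nxt
    return dist

# Toy expertise graph: documents d1, d2 and candidates alice, bob.
transition = {
    "d1": {"alice": 1.0},
    "d2": {"alice": 0.5, "bob": 0.5},
    "alice": {"d1": 0.5, "d2": 0.5},
    "bob": {"d2": 1.0},
}
start = {"d1": 0.7, "d2": 0.3}  # initial relevance mass on documents
one_step = k_step_walk(transition, start, 1)
three_step = k_step_walk(transition, start, 3)
```

After one step the mass sits on candidates exactly as in document-to-candidate propagation; after three steps some of it has travelled candidate to document to candidate, so `bob` gains mass through the shared document `d2`.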
Language-model-based ranking for queries on RDF-graphs
, 2009
Cited by 31 (11 self)
The success of knowledge-sharing communities like Wikipedia and the advances in automatic information extraction from textual and Web sources have made it possible to build large “knowledge repositories” such as DBpedia, Freebase, and YAGO. These collections can be viewed as graphs of entities and relationships (ER graphs) and can be represented as a set of subject-property-object (SPO) triples in the Semantic-Web data model RDF. Queries can be expressed in the W3C-endorsed SPARQL language or by similarly designed graph-pattern search. However, exact-match query semantics often fall short of satisfying the users’ needs by returning too many or too few results. Therefore, IR-style ranking models are crucially needed. In this paper, we propose a language-model-based approach to ranking the results of exact, relaxed and keyword-augmented graph-pattern queries over RDF graphs such as ER graphs. Our method estimates a query model and a set of result-graph models and ranks results based on their Kullback-Leibler divergence with respect to the query model. We demonstrate the effectiveness of our ranking model by a comprehensive user study.
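Ranking by Kullback-Leibler divergence can be sketched as below. The models here are plain dictionaries over hypothetical triple labels, the divergence direction KL(query || result) is an assumption, and the `eps` floor stands in for whatever smoothing the actual method uses.

```python
import math

def kl_divergence(p, q, eps=1e-9):
    """KL(p || q) over the support of p, flooring q at eps to avoid log(0)."""
    return sum(
        pv * math.log(pv / max(q.get(k, 0.0), eps))
        for k, pv in p.items()
        if pv > 0
    )

def rank_by_kl(query_model, result_models):
    """Order result names by increasing divergence from the query model."""
    return sorted(
        result_models,
        key=lambda name: kl_divergence(query_model, result_models[name]),
    )

query_model = {"bornIn": 0.5, "wonPrize": 0.5}
result_models = {
    "r1": {"bornIn": 0.5, "wonPrize": 0.5},
    "r2": {"bornIn": 0.9, "wonPrize": 0.1},
}
ranking = rank_by_kl(query_model, result_models)
```

A result whose model matches the query model exactly has zero divergence and ranks first; results whose probability mass is distributed differently fall lower.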
Routing Questions to the Right Users in Online Communities
, 2009
Cited by 19 (2 self)
Online forums contain huge amounts of valuable user-generated content. In current forum systems, users have to passively wait for other users to visit the forum systems and read/answer their questions. The user experience for question answering suffers from this arrangement. In this paper, we address the problem of “pushing” the right questions to the right persons, the objective being to obtain quick, high-quality answers, thus improving user satisfaction. We propose a framework for the efficient and effective routing of a given question to the top-k potential experts (users) in a forum, by utilizing both the content and structures of the forum system. First, we compute the expertise of users according to the content of the forum system; that is, we estimate the probability of a user being an expert for a given question based on the user's previous question answering. Specifically, we design three models for this task: a profile-based model, a thread-based model, and a cluster-based model. Second, we re-rank the user expertise measured in probability by utilizing the structural relations among users in a forum system. The results of the two steps can be integrated naturally in a probabilistic model that computes a final ranking score for each user. Experimental results show that the proposals are very promising.
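A profile-based router in the spirit of the first step might look like the sketch below. The smoothing scheme, the parameter `lam`, and the `user_prior` argument (a stand-in for a structure-derived authority score) are all assumptions; the abstract does not give the exact formulas.

```python
import math
from collections import Counter

def route_question(question, user_answers, user_prior, k=2, lam=0.8):
    """Return the top-k users by log P(question | profile) + log prior(user)."""
    # A user's profile is the bag of words of all their past answers.
    profiles = {
        u: Counter(t for ans in answers for t in ans)
        for u, answers in user_answers.items()
    }
    background = Counter()
    for prof in profiles.values():
        background.update(prof)
    bg_len = sum(background.values())
    scores = {}
    for u, prof in profiles.items():
        length = sum(prof.values())
        log_p = math.log(user_prior[u])  # prior must be strictly positive
        for term in question:
            p_user = prof[term] / length if length else 0.0
            # Add-one smoothed background keeps the probability above zero.
            p_bg = (background[term] + 1) / (bg_len + len(background) + 1)
            log_p += math.log(lam * p_user + (1 - lam) * p_bg)
        scores[u] = log_p
    return sorted(scores, key=scores.get, reverse=True)[:k]

user_answers = {
    "ann": [["python", "list", "sort"]],
    "bob": [["garden", "soil"]],
    "cy": [["python", "garden"]],
}
prior = {"ann": 1 / 3, "bob": 1 / 3, "cy": 1 / 3}
top = route_question(["python", "sort"], user_answers, prior, k=2)
```

Replacing the uniform `prior` with scores derived from reply relations among users would correspond to the second, structure-based re-ranking step.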
Ranking Users for Intelligent Message Addressing
Cited by 18 (7 self)
Finding persons who are knowledgeable on a given topic (i.e. Expert Search) has become an active area of recent research [1–3]. In this paper we investigate the related task of Intelligent Message Addressing, i.e., finding persons who are potential recipients of a message under composition given its current contents, its previously-specified recipients or a few initial letters of the intended recipient contact (intelligent auto-completion). We begin by providing quantitative evidence, from a very large corpus, of how frequently email users are subject to message addressing problems. We then propose several techniques for this task, including adaptations of well-known formal models of Expert Search. Surprisingly, a simple model based on the K-Nearest-Neighbors algorithm consistently outperformed all other methods. We also investigated combinations of the proposed methods using fusion techniques, which led to significant performance improvements over the baseline models. In auto-completion experiments, the proposed models also outperformed all standard baselines. Overall, the proposed techniques showed ranking performance of more than 0.5 in MRR over 5202 queries from 36 different email users, suggesting intelligent message addressing can be a welcome addition to email.
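For readers unfamiliar with the metric, mean reciprocal rank (MRR) averages, over all queries, the reciprocal of the rank at which the first correct item appears. The sketch below uses the standard definition; the toy rankings and recipient names are made up.

```python
def mean_reciprocal_rank(rankings, relevant_sets):
    """Average of 1 / rank of the first relevant item (0 if none is found)."""
    total = 0.0
    for ranking, relevant in zip(rankings, relevant_sets):
        for rank, item in enumerate(ranking, start=1):
            if item in relevant:
                total += 1.0 / rank
                break
    return total / len(rankings) if rankings else 0.0

# Two messages: the true recipient is ranked 2nd for the first, 1st for the second.
rankings = [["amy", "ben", "cat"], ["dan", "eve", "fay"]]
relevant_sets = [{"ben"}, {"dan"}]
mrr = mean_reciprocal_rank(rankings, relevant_sets)
```

An MRR above 0.5 therefore means that, on average, the first correct recipient appears at roughly rank two or better.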
Associating People and Documents
Cited by 17 (4 self)
Since the introduction of the Enterprise Track at TREC in 2005, the task of finding experts has generated a lot of interest within the research community. Numerous models have been proposed that rank candidates by their level of expertise with respect to some topic. Common to all approaches is a component that estimates the strength of the association between a document and a person. Forming such associations, then, is a key ingredient in expertise search models. In this paper we introduce and compare a number of methods for building document-people associations. Moreover, we make underlying assumptions explicit, and examine two in detail: (i) independence of candidates, and (ii) frequency is an indication of strength. We show that our refined ways of estimating the strength of associations between people and documents lead to significant improvements over the state-of-the-art in the end-to-end expert finding task.
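The contrast between a Boolean association and one where frequency indicates strength can be sketched as follows. Normalizing per document and the function name `associations` are assumptions made for illustration; the abstract does not describe the authors' actual estimators.

```python
from collections import Counter

def associations(doc_mentions, mode="frequency"):
    """Map (doc, person) -> association strength, normalized per document.

    mode="boolean": every mentioned person counts once.
    mode="frequency": strength grows with the number of mentions.
    """
    strengths = {}
    for doc, mentions in doc_mentions.items():
        counts = Counter(mentions)
        if mode == "boolean":
            counts = Counter(dict.fromkeys(counts, 1))
        total = sum(counts.values())
        for person, c in counts.items():
            strengths[(doc, person)] = c / total
    return strengths

doc_mentions = {"d1": ["alice", "alice", "alice", "bob"]}
boolean = associations(doc_mentions, mode="boolean")
frequency = associations(doc_mentions, mode="frequency")
```

Under the Boolean view both people are tied to `d1` equally; under the frequency view the person mentioned three times receives three quarters of the association mass.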
A language modeling framework for expert finding
- INFORMATION PROCESSING AND MANAGEMENT
, 2008
Multi-Aspect Expertise Matching for Review Assignment
Cited by 15 (1 self)
Review assignment is a common task that many people, such as conference organizers, journal editors, and grant administrators, have to do routinely. As a computational problem, it involves matching a set of candidate reviewers with a paper or proposal to be reviewed. A common deficiency of existing work on this problem is that it ignores the multiple aspects of topics or expertise and instead matches the entire document to be reviewed against the overall expertise of a reviewer. As a result, if a document contains multiple subtopics, which often happens, existing methods do not attempt to assign reviewers to cover all the subtopics; instead, it is quite possible that all the assigned reviewers cover the major subtopic well but none of the other subtopics. In this paper, we study how to model multiple aspects of expertise and assign reviewers so that they together can cover all subtopics in the document well. We propose three general strategies for solving this problem and propose new evaluation measures for this task. We also create a multi-aspect review assignment test set using ACM SIGIR publications. Experimental results on this data set show that the proposed methods are effective for assigning reviewers to cover all topical aspects of a document.
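The coverage goal can be illustrated with a generic greedy heuristic that repeatedly picks the reviewer covering the most still-uncovered subtopics. This is not one of the three strategies the abstract mentions (they are not described there); the topic labels, reviewer names, and function name are hypothetical.

```python
def greedy_assign(paper_topics, reviewer_topics, k):
    """Pick up to k reviewers, each maximizing newly covered subtopics.

    Ties are broken alphabetically by reviewer name. Returns the chosen
    reviewers and the set of subtopics left uncovered.
    """
    uncovered = set(paper_topics)
    chosen = []
    candidates = sorted(reviewer_topics)
    for _ in range(min(k, len(candidates))):
        best = max(
            (r for r in candidates if r not in chosen),
            key=lambda r: len(uncovered & reviewer_topics[r]),
        )
        chosen.append(best)
        uncovered -= reviewer_topics[best]
    return chosen, uncovered

paper_topics = {"ir", "ml", "db"}
reviewer_topics = {"r1": {"ir", "ml"}, "r2": {"ir"}, "r3": {"db"}}
chosen, uncovered = greedy_assign(paper_topics, reviewer_topics, k=2)
```

A matcher that scores each reviewer against the whole paper could pick `r1` and `r2`, both strong on the major subtopic, and leave "db" unreviewed; the coverage-driven choice of `r1` and `r3` avoids that.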