Discriminative Models of Integrating Document Evidence and Document-Candidate Associations for Expert Search (2010)

by Y Fang, L Si, A Mathur
Venue: SIGIR

Results 1 - 10 of 14

Learning Models for Ranking Aggregates

by Craig Macdonald, Iadh Ounis
Cited by 10 (1 self)
Aggregate ranking tasks are those where documents are not the final ranking outcome, but instead an intermediary component. For instance, in expert search, a ranking of candidate persons with relevant expertise to a query is generated after consideration of a document ranking. Many models exist for aggregate ranking tasks, however obtaining an effective and robust setting for different aggregate ranking tasks is difficult to achieve. In this work, we propose a novel learned approach to aggregate ranking, which combines different document ranking features as well as aggregate ranking approaches. We experiment with our proposed approach using two TREC test collections for expert and blog search. Our experimental results attest the effectiveness and robustness of a learned model for aggregate ranking across different settings.

Citation Context

...ach appears promising, our results on the expert search tasks are 15-19% higher than the best results obtained by their learned approach, showing the superiority of our proposed approach. Fang et al. [22] recently introduced a discriminative approach for expert search. In this approach, the importance of candidate features and association features are automatically learned from the training data. Howe...
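
As a rough illustration of the aggregate ranking idea described in the abstract above, the sketch below turns a document ranking into a candidate ranking by summing the scores of the documents associated with each candidate. The function name, the toy scores, and the sum-based aggregation are illustrative assumptions; this is not the learned model proposed by the authors.

```python
from collections import defaultdict


def rank_candidates(doc_scores, doc_candidates):
    """Aggregate document scores into a candidate ranking (hypothetical helper).

    doc_scores: dict mapping document id -> retrieval score for the query.
    doc_candidates: dict mapping document id -> list of associated candidate ids.
    Returns candidate ids sorted by descending summed score.
    """
    totals = defaultdict(float)
    for doc_id, score in doc_scores.items():
        for candidate in doc_candidates.get(doc_id, []):
            totals[candidate] += score
    # Sort by summed score (descending); break ties by candidate id.
    return sorted(totals, key=lambda c: (-totals[c], c))


# Toy data, invented for illustration only.
doc_scores = {"d1": 2.0, "d2": 1.5, "d3": 0.5}
doc_candidates = {"d1": ["alice"], "d2": ["bob", "alice"], "d3": ["bob"]}
ranking = rank_candidates(doc_scores, doc_candidates)
```

Other aggregation functions (for example, taking the maximum document score per candidate) could be swapped in by changing the accumulation step.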

Award Prediction with Temporal Citation Network Analysis

by Zaihan Yang, Dawei Yin, Brian D. Davison
Cited by 4 (3 self)
Each year many ACM SIG communities will recognize an outstanding researcher through an award in honor of his or her profound impact and numerous research contributions. This work is the first to investigate an automated mechanism to help in selecting future award winners. We approach the problem as a researchers’ expertise ranking problem, and propose a temporal probabilistic ranking model which combines content with citation network analysis. Experimental results based on real-world citation data and historical awardees indicate that some kinds of SIG awards are well modeled by this approach.

Citation Context

... of the approaches in evaluating the expertise of a researcher, different information probabilistic models have been provided, including language model [1], voting model [5], and discriminative model [3], which mainly emphasize evaluating the relevance between supporting documents and thus the corresponding authors with the query. Another direction of research, which is the research focus of this pos...

Learning to Rank Academic Experts

by Catarina Alexandra Pinto Moreira, 2011
Cited by 2 (1 self)
Abstract not found

Academic Network Analysis: A Joint Topic Modeling Approach

by Zaihan Yang, Liangjie Hong, Brian D. Davison
Cited by 1 (0 self)
We propose a novel probabilistic topic model that jointly models authors, documents, cited authors, and venues simultaneously in one integrated framework, as compared to previous work which embeds fewer components. This model is designed for three typical applications in academic network analysis: the problems of expert ranking, cited author prediction and venue prediction. Experiments based on two real world data sets demonstrate the model to be effective, and it outperforms several state-of-the-art algorithms in all three applications.

Citation Context

... of researchers based on their expertise in that query-specific domain. Two categories of approaches have been the research focus in the past years: the pure content analysis based approach [1], [16], [5], which emphasizes evaluating authors’ expertise by measuring the relevance between their associated documents and the query, and the social network based approach [3], [26], [30], [6], [11], which ev...

Features and aggregators for web-scale entity search. arXiv 1303.3164

by Uma Sawant, Soumen Chakrabarti, 2013
Cited by 1 (1 self)
We focus on two research issues in entity search: how to score a document or snippet that potentially supports a candidate entity, and how to aggregate or combine scores from different snippets into an entity score. Proximity scoring has been studied in IR outside the scope of entity search. However, aggregation has been hardwired except in a few cases where probabilistic language models are used. We instead explore simple, robust, discriminative ranking algorithms, with informative snippet features and broad families of aggregation functions. Our first contribution is a study of proximity-cognizant snippet features. In contrast with prior work which uses hardwired “proximity kernels” that implement a fixed decay with distance, we present a “universal” feature encoding which jointly expresses the perplexity (informativeness) of a query term match and the proximity of the match to the entity mention. Our second contribution is a study of aggregation functions. Rather than train the ranking algorithm on snippets and then aggregate scores, we directly train on entities such that the ranking algorithm takes into account the aggregation function being used. Our third contribution is an extensive Web-scale evaluation of the above algorithms on two data sets having quite different properties and behavior. The first one is the W3C dataset used in TREC-scale enterprise search, with pre-annotated entity mentions. The second is a Web-scale open-domain entity search dataset consisting of 500 million Web pages, which contain about 8 billion token spans annotated automatically with two million entities from 200,000 entity types in Wikipedia. On the TREC dataset, the performance of our system is comparable to the currently prevalent systems by Balog et al. (using Boolean associations) and MacDonald et al. On the much larger and noisier Web dataset, our system delivers significantly better performance than all other systems, with 8% MAP improvement over the closest competitor.

Citation Context

...function is itself learnt from entity relevance judgments. As we shall see here, the issue of robust, trainable proximity scoring is far from closed. 1.2 Evidence aggregation With very few exceptions [15, 27], entity and expert search algorithms in the IR community are heavily biased toward generative language models [12, 1, 14, 2]. In contrast, some of the best-known L2R algorithms use discriminative max...

Mining and analyzing the academic network

by Zaihan Yang, 2014
Abstract not found

Retrieval With Applications to Search in Conversational Social Media

by Jonathan L. Elsas, William Cohen

Citation Context

...s different tasks. Fang et al.’s Discriminative Model Fang et al. take a different approach, taking inspiration from Balog’s Model 2 (Equation 2.8) and modeling the two components as linear functions [50]. The parameterizations of these two linear functions are jointly learned from training data via gradient methods. In contrast to Balog’s generative model, which ranks aggregates by the likelihood of ...
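
The two-component structure described in this excerpt can be sketched roughly as follows. This is a minimal illustration, not the published model: it assumes each component is a logistic function of a linear feature combination and that a candidate's score is the sum, over documents, of the product of the two components. The feature vectors, weights, and function names are hypothetical, and the joint gradient-based training mentioned in the excerpt is omitted.

```python
import math


def sigmoid(x):
    """Logistic link (an assumption of this sketch)."""
    return 1.0 / (1.0 + math.exp(-x))


def dot(weights, features):
    """Linear combination of features with weights."""
    return sum(w * f for w, f in zip(weights, features))


def candidate_score(doc_feats, assoc_feats, w_doc, w_assoc):
    """Score one candidate for one query (hypothetical helper).

    doc_feats[i]: query-document features for document i.
    assoc_feats[i]: document-candidate association features for document i.
    w_doc, w_assoc: weight vectors of the two linear components.
    """
    return sum(
        sigmoid(dot(w_doc, fd)) * sigmoid(dot(w_assoc, fa))
        for fd, fa in zip(doc_feats, assoc_feats)
    )


# Toy inputs, invented for illustration: two documents, one feature each.
score = candidate_score([[1.0], [2.0]], [[1.0], [0.0]], [0.0], [0.0])
```

With all weights at zero, each logistic component evaluates to 0.5, so every document contributes 0.25 to the candidate's score; training would move the weights so that relevant document-candidate pairs contribute more.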

Improving Related Entity Finding via Incorporating Homepages and Recognizing Fine-grained Entities

by Youzheng Wu, Chiori Hori, Hisashi Kawai, Hideki Kashioka
This paper describes experiments on the TREC entity track that studies retrieval of homepages representing entities relevant to a query. Many studies have focused on extracting entities that match the given coarse-grained types such as organizations, persons, locations by using a named entity recognizer, and employing language model techniques to calculate similarities between query and supporting snippets of entities from which entities are extracted to rank the entities. This paper proposes three improvements over baseline, i.e., 1) incorporating homepages of entities to supplement supporting snippets, 2) recognizing fine-grained named entities to filter out or negatively reward extracted entities that do not match the specified fine-grained types of entities such as a university, airline, author, and 3) adopting a dependency tree-based similarity method to improve language model techniques. Our experiments demonstrate that the proposed approaches can significantly improve performance, for instance, the absolute improvements of

FINDING THE RIGHT EXPERT Discriminative Models for Expert Retrieval

by unknown authors
We tackle the problem of expert retrieval in Social Question Answering (SQA) sites. In particular, we consider the task of, given an information need in the form of a question posted in a SQA site, ranking potential experts according to the likelihood that they can answer the question. We propose a discriminative model (DM) that allows to combine different sources of evidence in a single retrieval model using machine learning techniques. The features used as input for the discriminative model comprise features derived from language models, standard probabilistic retrieval functions and features quantifying the popularity of an expert in the category of the question. As input for the DM, we propose a novel feature design that allows to exploit language models as features. We perform experiments and evaluate our approach on a dataset extracted from Yahoo! Answers, recently used as benchmark in the CriES Workshop, and show that our proposed approach outperforms i) standard probabilistic retrieval models, ii) a state-of-the-art expert retrieval approach based on language models as well as iii) an established learning to rank model.

Citation Context

...idence sources, in particular a category-based generative model, and by including support for multilingual retrieval. We show that the DM outperforms this mixture language model baseline. Fang et al. (Fang et al., 2010) propose a discriminative model to integrate document evidence and document-candidate associations for expert search. Similar to the DM in this paper, they use available relevance assessments as trai...

Leveraging Collection Structure in Information Retrieval With Applications to Search in Conversational Social Media

by Jonathan L. Elsas
Social media collections are becoming increasingly important in the everyday life of Internet users. Recent statistics show that sites hosting social media and community-generated content account for five of the top ten most visited websites in the United States [4], are visited regularly by a broad cross-section of Internet users [61, 67, 115] and host an enormous quantity of information [119, 48, 9]. The increasing importance and size of these collections requires that information retrieval systems pay special attention to these collections, and in particular pay attention to those aspects of social media collections that set them apart from the general web. Social media collections are interesting and challenging from the perspective of information retrieval systems. These collections are dynamic, with content being constantly added, removed and modified. These collections are time-sensitive, with the most recently added content often viewed as the most significant. These collections are richly structured, with authorship information, often threading structure and higher-level topical classifications. Although this type of collection structure is frequently critical for comprehension, it is rarely exploited in retrieval algorithms.

Citation Context

...s different tasks. Fang et al.’s Discriminative Model Fang et al. take a different approach, taking inspiration from Balog’s Model 2 (Equation 2.8) and modeling the two components as linear functions [50]. The parameterizations of these two linear functions are jointly learned from training data via gradient methods. In contrast to Balog’s generative model, which ranks aggregates by the likelihood of ...
