Results 21 - 30 of 45
Intent-aware search result diversification
- In Proceedings of the 34th ACM SIGIR, 2011
- Cited by 4 (1 self)
Search result diversification has gained momentum as a way to tackle ambiguous queries. An effective approach to this problem is to explicitly model the possible aspects underlying a query, in order to maximise the estimated relevance of the retrieved documents with respect to the different aspects. However, such aspects themselves may represent information needs with rather distinct intents (e.g., informational or navigational). Hence, a diverse ranking could benefit from applying intent-aware retrieval models when estimating the relevance of documents to different aspects. In this paper, we propose to diversify the results retrieved for a given query, by learning the appropriateness of different retrieval models for each of the aspects underlying this query. Thorough experiments within the evaluation framework provided by the diversity task of the TREC 2009 and 2010 Web tracks show that the proposed approach can significantly improve state-of-the-art diversification approaches.
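One way to picture the intent-aware relevance estimate described in this abstract is as a per-aspect mixture of retrieval models. The sketch below is an illustration only, not the authors' formulation: the function name, the aspect importances, the per-aspect model weights, and the relevance scores are all hypothetical, and the novelty component that a full diversification framework would add is left out.

```python
def intent_aware_relevance(doc, aspects, model_weights, model_scores):
    """Mix several retrieval models per aspect, then weight by aspect importance.

    aspects:       aspect -> importance of the aspect for the query (assumed given)
    model_weights: aspect -> {model name -> learned appropriateness weight}
    model_scores:  (model name, aspect, doc) -> relevance estimate from that model
    """
    total = 0.0
    for aspect, importance in aspects.items():
        # Relevance of the document to this aspect, as a weighted mix of models.
        mixed = sum(
            weight * model_scores[(model, aspect, doc)]
            for model, weight in model_weights[aspect].items()
        )
        total += importance * mixed
    return total


# Hypothetical query with one mostly navigational and one mostly informational aspect.
aspects = {"a1": 0.6, "a2": 0.4}
model_weights = {
    "a1": {"navigational": 0.9, "informational": 0.1},
    "a2": {"navigational": 0.2, "informational": 0.8},
}
model_scores = {
    ("navigational", "a1", "d"): 0.8,
    ("informational", "a1", "d"): 0.3,
    ("navigational", "a2", "d"): 0.1,
    ("informational", "a2", "d"): 0.5,
}
score = intent_aware_relevance("d", aspects, model_weights, model_scores)
```

In this reading, the learning problem is the choice of the per-aspect model weights; the numbers above are made up for the example.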
Intentional query suggestion: making user goals more explicit during search
- Proceedings of the 2009 workshop on Web Search Click Data, 2009
- Cited by 3 (0 self)
The degree to which users make their search intent explicit can be assumed to represent an upper bound on the level of service that search engines can provide. In a departure from traditional query expansion mechanisms, we introduce Intentional Query Suggestion as a novel idea that attempts to make users’ intent more explicit during search. In this paper, we present a prototypical algorithm for Intentional Query Suggestion and we discuss corresponding data from comparative experiments with traditional query suggestion mechanisms. Our preliminary results indicate that intentional query suggestions 1) diversify search result sets (i.e., they reduce result set overlap) and 2) have the potential to yield higher click-through rates than traditional query suggestions.
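The notion of "result set overlap" mentioned in this abstract can be made concrete with a simple set measure. The abstract does not say which measure the authors used, so the sketch below assumes Jaccard overlap over result URLs; the function name and the sample result lists are hypothetical.

```python
def result_set_overlap(results_a, results_b):
    """Jaccard overlap between two result lists, treated as sets of URLs."""
    set_a, set_b = set(results_a), set(results_b)
    union = set_a | set_b
    if not union:
        # Two empty result lists share nothing; define their overlap as 0.
        return 0.0
    return len(set_a & set_b) / len(union)


# Results for an original query and for a suggested query (made-up identifiers).
original = ["u1", "u2", "u3", "u4"]
suggested = ["u3", "u4", "u5", "u6"]
overlap = result_set_overlap(original, suggested)
```

Under this measure, a suggestion mechanism diversifies more when the overlap between the original and suggested result sets is lower.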
Coupling Feature Selection and Machine Learning Methods for Navigational Query Identification
- 2006
- Cited by 2 (0 self)
It is important yet hard to identify navigational queries in Web search due to a lack of sufficient information in Web queries, which are typically very short. In this paper we study several machine learning methods, including the naive Bayes model, maximum entropy model, support vector machine (SVM), and stochastic gradient boosting tree (SGBT), for navigational query identification in Web search. To boost the performance of these machine learning techniques, we exploit several feature selection methods and propose coupling feature selection with classification approaches to achieve the best performance. Unlike most prior work, which uses a small number of features, we study the problem of identifying navigational queries with thousands of available features, extracted from major commercial search engine results, Web search user click data, query logs, and the whole Web’s relational content. A multi-level feature extraction system is constructed. Our results on real search data show that 1) among all the features we tested, user click distribution features are the most important set of features for identifying navigational queries; 2) in order to achieve good performance, machine learning approaches have to be coupled with good feature selection methods, and we find that gradient boosting tree coupled with linear SVM feature selection is most effective; 3) with carefully coupled feature selection and classification approaches, navigational queries can be accurately identified with an 88.1% F1 score, which is a 33% error rate reduction compared to the best uncoupled system and a 40% error rate reduction compared to a well-tuned system without feature selection.
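The error rate reductions quoted in this abstract can be turned into rough baseline figures. The sketch below assumes that "error rate" means 1 - F1, which the abstract does not state, and the quoted percentages are rounded, so the implied baseline scores are back-of-envelope estimates and not reported results. All names are hypothetical.

```python
def baseline_error(improved_error, relative_reduction):
    """Invert a relative error reduction: improved = baseline * (1 - reduction)."""
    return improved_error / (1.0 - relative_reduction)


coupled_f1 = 0.881                     # reported in the abstract
coupled_error = 1.0 - coupled_f1       # assumption: error rate = 1 - F1

# Implied error of the two comparison systems under that assumption.
uncoupled_error = baseline_error(coupled_error, 0.33)
no_selection_error = baseline_error(coupled_error, 0.40)

implied_uncoupled_f1 = 1.0 - uncoupled_error        # roughly 0.82
implied_no_selection_f1 = 1.0 - no_selection_error  # roughly 0.80
```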
A decision mechanism for the selective combination of evidence in topic distillation
- Information Retrieval
- Cited by 2 (1 self)
The combination of evidence can increase retrieval effectiveness. In this paper, we investigate the effectiveness of a decision mechanism for the selective combination of evidence for Web Information Retrieval and particularly for topic distillation. We introduce two measures of a query’s broadness and use them to select an appropriate combination of evidence for each query. The results from our experiments show that there is a statistically significant association between the output of the decision mechanism and the relative effectiveness of the different combinations of evidence. Moreover, we show that the proposed methodology can be applied in an operational setting, where relevance information is not available, by setting the decision mechanism’s thresholds automatically.
Ranking with Query-Dependent Loss for Web Search
- Cited by 2 (0 self)
Queries describe the users’ search intent and therefore they play an essential role in the context of ranking for information retrieval and Web search. However, most existing approaches for ranking do not explicitly take into consideration the fact that queries vary significantly along several dimensions and entail different treatments regarding the ranking models. In this paper, we propose to incorporate query difference into ranking by introducing query-dependent loss functions. In the context of Web search, query difference is usually represented as different query categories, and queries are usually classified according to search intent, such as navigational, informational and transactional queries. Based on the observation that this kind of query categorization has high correlation with the user’s different expectation on the result accuracy at different rank positions, we develop position-sensitive query-dependent loss functions exploring this kind of query categorization. Beyond the simple learning method that builds ranking functions with pre-defined query categorization, we further propose a new method that learns both ranking functions and query categorization simultaneously. We apply the query-dependent loss functions to two particular ranking algorithms, RankNet and ListMLE. Experimental results demonstrate that query-dependent loss functions can be exploited to significantly improve the accuracy of learned ranking functions. We also show that the ranking function jointly learned with query categorization can achieve better performance than that learned with pre-defined query categorization. Finally, we provide analysis and conduct additional experiments to gain a deeper understanding of the advantages of ranking with query-dependent loss functions over other query-dependent ranking approaches and query-independent approaches.
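The idea of a position-sensitive, query-dependent loss can be sketched on top of a RankNet-style pairwise logistic loss. This is a minimal illustration and not the paper's formulation: the weighting scheme, the category names used as keys, the cut-off positions, and all function names are assumptions made for the example.

```python
import math


def position_weight(category, target_rank):
    """Hypothetical weights: how much a mistake at this ideal rank position matters."""
    if category == "navigational":
        # Assume navigational users mostly care about the single top result.
        return 1.0 if target_rank == 1 else 0.1
    # Assume informational and transactional users care about the top few results.
    return 1.0 if target_rank <= 5 else 0.1


def pairwise_logistic_loss(score_preferred, score_other):
    """RankNet-style loss for one pair where the first document should rank higher."""
    return math.log1p(math.exp(-(score_preferred - score_other)))


def query_dependent_loss(category, pairs):
    """Sum of pair losses, each weighted by the query category and rank position.

    pairs: list of (score_preferred, score_other, ideal_rank_of_preferred)
    """
    return sum(
        position_weight(category, rank) * pairwise_logistic_loss(s_pref, s_other)
        for s_pref, s_other, rank in pairs
    )


# The same scored pairs produce different losses for different query categories.
pairs = [(2.0, 1.0, 1), (0.5, 1.5, 3)]
navigational_loss = query_dependent_loss("navigational", pairs)
informational_loss = query_dependent_loss("informational", pairs)
```

Here the misordered pair at ideal rank 3 is penalised fully for the informational query but only lightly for the navigational one, which is the kind of category-specific treatment the abstract describes.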
ICT-DCU question answering task at ntcir6
- In Proceedings of the Sixth NTCIR Workshop on Research in Information Access Technologies, 2007
- Cited by 1 (1 self)
This paper describes details of our participation in the NTCIR-6 Chinese-to-Chinese Question Answering task. We use the “retrieval plus extraction” approach to get answers for questions. We first split the documents into short passages, then retrieve potentially relevant passages for a question, and finally extract named entity answers from the most relevant passages. For question type identification, we use simple heuristic rules which cover most questions. The Lemur toolkit was used with the Okapi model for document retrieval. Results of our task submission are given and some preliminary conclusions are drawn. Keywords: NTCIR, Chinese-to-Chinese Question Answering,
A website mining model centered on user queries
- In EWMF 2005, 2005
- Cited by 1 (1 self)
We present a model for mining user queries found within the access logs of a website and for relating this information to the website’s overall usage, structure and content. The aim of this model is to discover, in a simple way, valuable information to improve the quality of the website, allowing the website to become more intuitive and adequate for the needs of its users. This model presents a methodology of analysis and classification of the different types of queries registered in the usage logs of a website, such as queries submitted by users to the site’s internal search engine and queries on global search engines that lead to documents in the website. These queries provide useful information about topics that interest users visiting the website, and the navigation patterns associated with these queries indicate whether or not the documents in the site satisfied the user’s needs at that moment.
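The distinction this abstract draws between internal and external queries can be sketched as a small log-entry classifier. This is an illustration under stated assumptions, not the paper's method: the internal search path, the query parameter name, and the list of external search engine hosts are all hypothetical.

```python
from urllib.parse import urlparse, parse_qs

EXTERNAL_ENGINES = {"www.google.com", "www.bing.com"}  # hypothetical host list
INTERNAL_SEARCH_PATH = "/search"                        # hypothetical site path
QUERY_PARAM = "q"                                       # hypothetical parameter name


def classify_log_entry(request_url, referrer):
    """Return (query type, query text) for one access-log entry."""
    request = urlparse(request_url)
    if request.path == INTERNAL_SEARCH_PATH:
        # Query typed into the site's own search box.
        terms = parse_qs(request.query).get(QUERY_PARAM)
        if terms:
            return ("internal", terms[0])
    ref = urlparse(referrer)
    if ref.hostname in EXTERNAL_ENGINES:
        # Visit that arrived from a global search engine result page.
        terms = parse_qs(ref.query).get(QUERY_PARAM)
        if terms:
            return ("external", terms[0])
    return ("none", None)


internal = classify_log_entry("/search?q=opening+hours", "")
external = classify_log_entry(
    "/docs/fees.html", "https://www.google.com/search?q=tuition+fees"
)
neither = classify_log_entry("/index.html", "")
```

Grouping the classified queries by the documents they led to would then give a starting point for the kind of usage analysis the abstract outlines.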
Empirical Exploitation of Click Data for Task Specific Ranking
- Cited by 1 (1 self)
There is an increasing need for task-specific rankings in web search, such as rankings for specific query segments like long queries, time-sensitive queries, navigational queries, etc.; or rankings for specific domains/contents like answers, blogs, news, etc. In the spirit of “divide-and-conquer”, task-specific ranking may have potential advantages over generic ranking since different tasks have task-specific features, data distributions, as well as feature-grade correlations. A critical problem for task-specific ranking is training data insufficiency, which may be solved by using data extracted from click logs. This paper empirically studies how to appropriately exploit click data to improve rank function learning in task-specific ranking. The main contributions are 1) the exploration of the utilities of two promising approaches for click pair extraction; 2) the analysis of the role played by the noise information which inevitably appears in click data extraction; 3) the appropriate strategy for combining training data and click data; 4) the comparison of click data which are consistent and inconsistent with the baseline function.
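Click pair extraction can be illustrated with the widely used "skip-above" heuristic, in which a clicked document is taken to be preferred over unclicked documents ranked above it. The abstract does not name its two extraction approaches, so this sketch is an assumed example of the general technique and not necessarily one of the paper's methods; the function name and document identifiers are hypothetical.

```python
def skip_above_pairs(ranked_docs, clicked):
    """Extract (preferred, less preferred) pairs from one ranked result list.

    A clicked document is assumed to be preferred over every unclicked
    document that was shown above it.
    """
    clicked = set(clicked)
    pairs = []
    for position, doc in enumerate(ranked_docs):
        if doc in clicked:
            for skipped in ranked_docs[:position]:
                if skipped not in clicked:
                    pairs.append((doc, skipped))
    return pairs


# One session: four results shown, the third one clicked.
pairs = skip_above_pairs(["d1", "d2", "d3", "d4"], ["d3"])
```

Pairs extracted this way are noisy, which is why the abstract's questions about noise and about how to combine click data with labelled training data matter.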
General Terms Algorithms, Experimentation
Many searches on the web have a transactional intent. We argue that pages satisfying transactional needs can be distinguished from the more common pages that have some information and links, but cannot be used to execute a transaction. Based on this hypothesis, we provide a recipe for constructing a transaction annotator. By constructing an annotator with one corpus and then demonstrating its classification performance on another, we establish its robustness. Finally, we show experimentally that a search procedure that exploits such pre-annotation greatly outperforms traditional search for retrieving transactional pages.

