Results 1 - 10 of 167
Searching The Web: The Public and Their Queries, 2001
Cited by 188 (7 self)
In studying actual Web searching by the public at large, we analyzed over one million Web queries by users of the Excite search engine. We found that most people use few search terms, few modified queries, view few Web pages, and rarely use advanced search features. A small number of search terms are used with high frequency, and a great many terms are unique; the language of Web queries is distinctive. Queries about recreation and entertainment rank highest. Findings are compared to data from two other large studies of Web queries. This study provides an insight into the public practices and choices in Web searching.
Automatic Identification of User Goals in Web Search, 2005
Cited by 86 (2 self)
There has been recent interest in studying the "goal" behind a user's Web query, so that this goal can be used to improve the quality of a search engine's results. Previous studies have mainly focused on using manual query-log investigation to identify Web query goals. In this paper we study whether and how we can automate this goal-identification process. We first present our results from a human subject study that strongly indicate the feasibility of automatic query-goal identification. We then propose two types of features for the goal-identification task: user-click behavior and anchor-link distribution. Our experimental evaluation shows that by combining these features we can correctly identify the goals for 90% of the queries studied.
Hourly analysis of a very large topically categorized web query log
- In SIGIR ’04: Proceedings of the 27th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, 2004
Cited by 81 (8 self)
We review a query log of hundreds of millions of queries that constitute the total query traffic for an entire week of a general-purpose commercial web search service. Previously, query logs have been studied from a single, cumulative view. In contrast, our analysis shows changes in popularity and uniqueness of topically categorized queries across the hours of the day. We examine query traffic on an hourly basis by matching it against lists of queries that have been topically pre-categorized by human editors. This represents 13% of the query traffic. We show that query traffic from particular topical categories differs both from the query stream as a whole and from other categories. This analysis provides valuable insight for improving retrieval effectiveness and efficiency. It is also relevant to the development of enhanced query disambiguation, routing, and caching algorithms.
A Review of Web Searching Studies and a Framework for Future Research, 2000
Cited by 74 (0 self)
Research on Web searching is at an incipient stage. This aspect provides a unique opportunity to review the current state of research in the field, identify common trends, develop a methodological framework, and define terminology for future Web searching studies. In this article, the results from published studies of Web searching are reviewed in order to present the current state of research. The analysis of the limited Web searching studies available indicates that research methods and terminology are already diverging. A framework is proposed for future studies that will facilitate comparison of results. The advantages of such a framework are presented, and the implications for the design of Web information retrieval systems studies are discussed. Additionally, the searching characteristics of Web users are compared and contrasted with users of traditional information retrieval and online public access systems to discover if there is a need for more studies that focus predominantly or exclusively on Web searching. The comparison indicates that Web searching differs from searching in other environments.
Evaluating the accuracy of implicit feedback from clicks and query reformulations in web search
- ACM Transactions on Information Systems (TOIS), 2007
Cited by 64 (8 self)
This paper examines the reliability of implicit feedback generated from clickthrough data and query reformulations in WWW search. Analyzing the users' decision process using eyetracking and comparing implicit feedback against manual relevance judgments, we conclude that clicks are informative but biased. While this makes the interpretation of clicks as absolute relevance judgments difficult, we show that relative preferences derived from clicks are reasonably accurate on average. We find that such relative preferences are accurate not only between results from an individual query, but across multiple sets of results within chains of query reformulations.
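As an illustration of what a "relative preference derived from clicks" can look like, the following is a minimal sketch of one commonly described heuristic: a clicked result is taken to be preferred over every unclicked result ranked above it. The abstract does not say which strategies the paper evaluates, so the heuristic, the function name `preferences_from_clicks`, and the sample data are all assumptions made for illustration only.

```python
from typing import Iterable, List, Sequence, Tuple


def preferences_from_clicks(
    ranking: Sequence[str], clicked: Iterable[str]
) -> List[Tuple[str, str]]:
    """Return (preferred, less_preferred) pairs for one result list.

    Assumed heuristic (not taken from the abstract): a clicked result is
    preferred over each unclicked result that was ranked above it.
    """
    clicked_set = set(clicked)
    prefs: List[Tuple[str, str]] = []
    for position, doc in enumerate(ranking):
        if doc not in clicked_set:
            continue
        # Compare the clicked result with everything ranked above it.
        for above in ranking[:position]:
            if above not in clicked_set:
                prefs.append((doc, above))
    return prefs


# Hypothetical result list and clicks for a single query.
example_ranking = ["a", "b", "c", "d"]
example_prefs = preferences_from_clicks(example_ranking, ["c"])
```

The output is a set of pairwise judgments rather than absolute relevance labels, which matches the abstract's distinction between relative and absolute interpretations of clicks.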
Measuring Search Engine Quality, 2001
Cited by 47 (8 self)
The effectiveness of twenty public search engines is evaluated using TREC-inspired methods and a set of 54 queries taken from real Web search logs. The World Wide Web is taken as the test collection and a combination of crawler and text retrieval system is evaluated. The engines are compared on a range of measures derivable from binary relevance judgments of the first seven live results returned. Statistical testing reveals a significant difference between engines and high inter-correlations between measures. Surprisingly, given the dynamic nature of the Web and the time elapsed, there is also a high correlation between results of this study and a previous study by Gordon and Pathak. For nearly all engines, there is a gradual decline in precision at increasing cutoff after some initial fluctuation. Performance of the engines as a group is found to be inferior to the group of participants in the TREC-8 Large Web task, although the best engines approach the median of those systems. Shortcomings of current Web search evaluation methodology are identified and recommendations are made for future improvements. In particular, the present study and its predecessors deal with queries which are assumed to derive from a need to find a selection of documents relevant to a topic. By contrast, real Web search reflects a range of other information need types which require different judging and different measures. The authors wish to acknowledge that this work was carried out partly within the Cooperative Research Centre for Advanced Computational Systems established under the Australian Government's Cooperative Research Centres Program.
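To make "precision at increasing cutoff" concrete, here is a minimal sketch of precision at cutoff k computed from binary relevance judgments of a ranked list. The function name `precision_at_k` and the sample judgments are hypothetical, and the abstract does not list the exact measures the study used, so this shows only the standard definition.

```python
from typing import List, Sequence


def precision_at_k(judgments: Sequence[bool], k: int) -> float:
    """Fraction of the top-k results judged relevant.

    `judgments[i]` is True when the result at rank i + 1 was judged
    relevant. If fewer than k results were returned, the missing ranks
    count as non-relevant (an assumption of this sketch).
    """
    if k <= 0:
        raise ValueError("k must be positive")
    relevant_in_top_k = sum(1 for judged in judgments[:k] if judged)
    return relevant_in_top_k / k


# Hypothetical judgments for the first seven results of one query.
judged: List[bool] = [True, True, False, True, False, False, False]

# Precision at cutoffs 1 through 7 for that query.
curve: List[float] = [precision_at_k(judged, k) for k in range(1, 8)]
```

Averaging such per-query curves over a query set gives a precision-versus-cutoff curve of the kind the abstract describes.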
Studying the use of popular destinations to enhance Web search interaction
- ACM SIGIR '07. ACM, 2007
Cited by 44 (10 self)
We present a novel Web search interaction feature which, for a given query, provides links to websites frequently visited by other users with similar information needs. These popular destinations complement traditional search results, allowing direct navigation to authoritative resources for the query topic. Destinations are identified using the history of search and browsing behavior of many users over an extended time period, whose collective behavior provides a basis for computing source authority. We describe a user study which compared the suggestion of destinations with the previously proposed suggestion of related queries, as well as with traditional, unaided Web search. Results show that search enhanced by destination suggestions outperforms other systems for exploratory tasks, with best performance obtained from mining past user behavior at query-level granularity.
Term proximity scoring for keyword-based retrieval systems
- In Proc. of the 25th European Conf. on IR Research, 2003
Cited by 42 (2 self)
This paper suggests the use of proximity measurement in combination with the Okapi probabilistic model. First, using the Okapi system, our investigation was carried out in a distributed retrieval framework to calculate the same relevance score as that achieved by a single centralized index. Second, by applying a term-proximity scoring heuristic to the top documents returned by a keyword-based system, our aim is to enhance retrieval performance. Our experiments were conducted using the TREC8, TREC9 and TREC10 test collections, and show that the suggested approach is stable and generally tends to improve retrieval effectiveness especially at the top documents retrieved.
Interactive Internet search: Keyword, directory and query reformulation mechanisms compared
- In Proceedings of the 23rd Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, 2000
Cited by 39 (3 self)
This article compares search effectiveness when using query-based Internet search (via the Google search engine), directory-based search (via Yahoo) and phrase-based query reformulation assisted search (via the Hyperindex browser) by means of a controlled, user-based experimental study. The focus was to evaluate aspects of the search process. Cognitive load was measured using a secondary digit-monitoring task to quantify the effort of the user in various search states; independent relevance judgements were employed to gauge the quality of the documents accessed during the search process. Time was monitored in various search states. Results indicated the directory-based search does not offer increased relevance over the query-based search (with or without query formulation assistance), and also takes longer. Query reformulation does significantly improve the relevance of the documents through which the user must trawl versus standard query-based internet search. However,...
Mining Anchor Text for Query Refinement
- WWW2004, 2004
Cited by 39 (1 self)
When searching large hypertext document collections, it is often possible that there are too many results available for ambiguous queries. Query refinement is an interactive process of query modification that can be used to narrow down the scope of search results. We propose a new method for automatically generating refinements or related terms to queries by mining anchor text for a large hypertext document collection. We show that the usage of anchor text as a basis for query refinement produces high quality refinement suggestions that are significantly better in terms of perceived usefulness compared to refinements that are derived using the document content. Furthermore, our study suggests that anchor text refinements can also be used to augment traditional query refinement algorithms based on query logs, since they typically differ in coverage and produce different refinements. Our results are based on experiments on an anchor text collection of a large corporate intranet.

