Results 1-10 of 28
Modeling the Impact of Short- and Long-Term Behavior on Search Personalization
Cited by 30 (9 self)
User behavior provides many cues to improve the relevance of search results through personalization. One aspect of user behavior that provides especially strong signals for delivering better relevance is an individual’s history of queries and clicked documents. Previous studies have explored how short-term behavior or long-term behavior can be predictive of relevance. Ours is the first study to assess how short-term (session) behavior and long-term (historic) behavior interact, and how each may be used in isolation or in combination to optimally contribute to gains in relevance through search personalization. Our key findings include: historic behavior provides substantial benefits at the start of a search session; short-term session behavior contributes the majority of gains in an extended search session; and the combination of session and historic behavior out-performs using either alone. We also characterize how the relative contribution of each model changes throughout the duration of a session. Our findings have implications for the design of search systems that leverage user behavior to personalize the search experience.
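One way to picture the reported interaction between the two signals is a blend whose weight shifts from historic to session evidence as a session grows. The sketch below is only an illustration under stated assumptions: the function `blend_scores`, the smoothing constant `k`, and the weighting rule `n / (n + k)` are hypothetical and are not taken from the paper, whose abstract does not specify the combination model.

```python
def blend_scores(session_score, historic_score, queries_so_far, k=3.0):
    """Blend two relevance scores for one document.

    Hypothetical rule: the session weight is n / (n + k), so it is 0 at the
    start of a session (historic behavior only) and approaches 1 as the
    session gets longer. `k` is an assumed smoothing constant.
    """
    if queries_so_far < 0 or k <= 0:
        raise ValueError("queries_so_far must be >= 0 and k must be > 0")
    session_weight = queries_so_far / (queries_so_far + k)
    return session_weight * session_score + (1.0 - session_weight) * historic_score


# At the first query only the historic score counts; later the session dominates.
start = blend_scores(session_score=0.2, historic_score=0.8, queries_so_far=0)
later = blend_scores(session_score=0.2, historic_score=0.8, queries_so_far=9)
```

With these assumed inputs the blended score starts at the historic score and moves toward the session score as `queries_so_far` increases.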
Probabilistic Models for Personalizing Web Search
Cited by 22 (7 self)
We present a new approach for personalizing Web search results to a specific user. Ranking functions for Web search engines are typically trained by machine learning algorithms using either direct human relevance judgments or indirect judgments obtained from click-through data from millions of users. The rankings are thus optimized to this generic population of users, not to any specific user. We propose a generative model of relevance which can be used to infer the relevance of a document to a specific user for a search query. The user-specific parameters of this generative model constitute a compact user profile. We show how to learn these profiles from a user’s long-term search history. Our algorithm for computing the personalized ranking is simple and has little computational overhead. We evaluate our personalization approach using historical search data from thousands of users of a major Web search engine. Our findings demonstrate gains in retrieval performance for queries with high ambiguity, with particularly large improvements for acronym queries.
Predictive Client-side Profiles for Personalized Advertising
Cited by 12 (1 self)
Personalization is ubiquitous in modern online applications as it provides significant improvements in user experience by adapting it to inferred user preferences. However, there are increasing concerns related to issues of privacy and control of the user data that is aggregated by online systems to power personalized experiences. These concerns are particularly significant for user profile aggregation in online advertising. This paper describes a practical, learning-driven client-side personalization approach for keyword advertising platforms, an emerging application previously not addressed in literature. Our approach relies on storing user-specific information entirely within the user’s control (in a browser cookie or browser local storage), thus allowing the user to view, edit or purge it at any time (e.g., via a dedicated webpage). We
Studies of the Onset and Persistence of Medical Concerns in Search Logs
Cited by 10 (4 self)
The Web provides a wealth of information about medical symptoms and disorders. Although this content is often valuable to consumers, studies have found that interaction with Web content may heighten anxiety and stimulate healthcare utilization. We present a longitudinal log-based study of medical search and browsing behavior on the Web. We characterize how users focus on particular medical concerns and how concerns persist and influence future behavior, including changes in focus of attention in searching and browsing for health information. We build and evaluate models that predict transitions from searches on symptoms to searches on health conditions, and escalations from symptoms to serious illnesses. We study the influence that the prior onset of concerns may have on future behavior, including sudden shifts back to searching on the concern amidst other searches. Our findings have implications for refining Web search and retrieval to support people pursuing diagnostic information.
To each his own: personalized content selection based on text comprehensibility
In Proceedings of the 5th ACM International Conference on Web Search and Data Mining, 2012
Cited by 10 (1 self)
Imagine a physician and a patient doing a search on antibiotic resistance. Or a chess amateur and a grandmaster conducting a search on Alekhine’s Defence. Although the topic is the same, arguably the two users in each case will satisfy their information needs with very different texts. Yet today search engines mostly adopt the one-size-fits-all solution, where personalization is restricted to topical preference. We found that users do not uniformly prefer simple texts, and that the text comprehensibility level should match the user’s level of preparedness. Consequently, we propose to model the comprehensibility of texts as well as the users’ reading proficiency in order to better explain how different users choose content for further exploration. We also model topic-specific reading proficiency, which allows us to better explain why a physician might choose to read sophisticated medical articles yet simple descriptions of SLR cameras. We explore different ways to build user profiles, and use collaborative filtering techniques to overcome data sparsity. We conducted experiments on large-scale datasets from a major Web search engine and a community question answering forum. Our findings confirm that explicitly modeling text comprehensibility can significantly improve content ranking (search results or answers, respectively).
Fighting Search Engine Amnesia: Reranking Repeated Results
2013
Cited by 8 (2 self)
Web search engines frequently show the same documents repeatedly for different queries within the same search session, in essence forgetting when the same documents were already shown to users. Depending on previous user interaction with the repeated results, and the details of the session, we show that sometimes the repeated results should be promoted, while at other times they should be demoted. Analysing search logs from two different commercial search engines, we find that results are repeated in about 40% of multi-query search sessions, and that users engage differently with repeats than with results shown for the first time. We demonstrate how statistics about result repetition within search sessions can be incorporated into ranking for personalizing search results. Our results on query logs of two large-scale commercial search engines suggest that we successfully promote documents that are more likely to be clicked by the user in the future while maintaining performance over standard measures of non-personalized relevance.
An exploration of ranking heuristics in mobile local search
In Proceedings of the 35th International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR ’12), 2012
Cited by 6 (0 self)
Users increasingly rely on their mobile devices to search local entities, typically businesses, while on the go. Even though recent work has recognized that the ranking signals in mobile local search (e.g., distance and customer rating score of a business) are quite different from general Web search, they have mostly treated these signals as a black-box to extract very basic features (e.g., raw distance values and rating scores) without going inside the signals to understand how exactly they affect the relevance of a business. However, as it has been demonstrated in the development of general information retrieval models, it is critical to explore the underlying behaviors/heuristics of a ranking signal to design more effective ranking features. In this paper, we follow a data-driven methodology to study the behavior of these ranking signals in mobile local search using a large-scale query log. Our analysis reveals interesting heuristics that can be used to guide the exploitation of different signals. For example, users often take the mean value of a signal (e.g., rating) from the business result list as a “pivot” score, and tend to demonstrate different click behaviors on businesses with lower and higher signal values than the pivot; the click rate of a business generally is sublinearly decreasing with its distance to the user, etc. Inspired by the understanding of these heuristics, we further propose different transformation methods to generate more effective ranking features. We quantify the improvement of the proposed new features using real mobile local search logs over a period of 14 months and show that the mean average precision can be improved by over 7%.
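The "pivot" heuristic described above can be sketched as a simple feature transformation: instead of using a raw signal value, express it relative to the mean of that signal over the result list. This is a minimal sketch under assumptions; the function name `pivot_offsets` and the choice of a plain signed difference are hypothetical, since the abstract does not state the paper's actual transformation methods.

```python
from statistics import mean


def pivot_offsets(values):
    """Return each value's signed offset from the list mean (the "pivot").

    Positive offsets mark results above the pivot, negative offsets mark
    results below it. An empty result list yields an empty feature list.
    """
    if not values:
        return []
    pivot = mean(values)
    return [v - pivot for v in values]


# Ratings of four businesses shown in one result list (illustrative data).
ratings = [4.5, 3.0, 4.0, 2.5]
offsets = pivot_offsets(ratings)
```

A ranker could then consume `offsets` in place of, or alongside, the raw ratings, so that the same rating is treated differently depending on the other results shown with it.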
Know Your Personalization: Learning Topic level Personalization in Online Services
Cited by 5 (0 self)
Online service platforms (OSPs), such as search engines, news-websites, ad-providers, etc., serve highly personalized content to the user, based on the profile extracted from her history with the OSP. Although personalization (generally) leads to a better user experience, it also raises privacy concerns for the user: she does not know what is present in her profile and, more importantly, what is being used to personalize her content. In this paper, we capture an OSP’s personalization for a user in a new data structure called the personalization vector (η), which is a weighted vector over a set of topics, and present efficient algorithms to learn it. Our approach treats OSPs as black-boxes, and extracts η by mining only their output, specifically, the personalized (for a user) and vanilla (without any user information) contents served, and the differences in these contents. We believe that such treatment of OSPs is a unique aspect of our work, not just enabling access to (so far hidden) profiles in OSPs, but also providing a novel and practical approach for retrieving information from OSPs by mining differences in their outputs. We formulate a new model called Latent Topic Personalization (LTP) that captures the personalization vector in a learning framework and present efficient inference algorithms for determining it. We perform extensive experiments targeting search engine personalization, using data from both real Google users and a synthetic setup. Our results indicate that LTP achieves high accuracy (R-pre = 84%) in discovering personalized topics. For Google data, our qualitative results demonstrate that the topics determined by LTP for a user correspond well to his ad-categories determined by Google.
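To make the idea of a weighted vector over topics concrete, the sketch below estimates such a vector by counting which topics appear more often in personalized output than in vanilla output. It is not the LTP model from the paper: the function `personalization_vector`, the assumption that every served item already carries a topic label, and the surplus-count weighting are all hypothetical simplifications for illustration.

```python
from collections import Counter


def personalization_vector(personalized_topics, vanilla_topics):
    """Estimate topic weights from the difference between two outputs.

    Each argument is a list of topic labels, one per served item. A topic
    gets weight only if it appears more often in the personalized list
    than in the vanilla list; weights are normalized to sum to 1.
    """
    personalized = Counter(personalized_topics)
    vanilla = Counter(vanilla_topics)
    # Counter returns 0 for topics that never appear in the vanilla output.
    surplus = {
        topic: count - vanilla[topic]
        for topic, count in personalized.items()
        if count > vanilla[topic]
    }
    total = sum(surplus.values())
    if total == 0:
        return {}
    return {topic: extra / total for topic, extra in surplus.items()}


# Illustrative topic labels for six personalized and six vanilla results.
eta = personalization_vector(
    ["sports", "sports", "sports", "travel", "travel", "news"],
    ["sports", "travel", "news", "news", "finance", "finance"],
)
```

In this toy input, "sports" and "travel" are over-represented in the personalized output, so they are the only topics that receive weight.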
Efficient Representation of the Lifelong Web Browsing User Characteristics
Cited by 3 (1 self)
Client-based user modelling has already been studied and clearly has its place among generic approaches to user modelling. It is especially advantageous for lifelong user modelling as it can support modelling at any time and in any place, including consideration of user privacy. The emergence of web browser extensions opens up possibilities for a pure browser-based realisation of client-based user modelling. In this paper, we focus on the efficient representation of a generic user model inside a web browser, which forms the core part of a browser-based user modelling framework in the form of a browser extension. Efficiency is also crucial from the lifelong perspective. We propose an efficient method of lifelog indexing and modelling various user characteristics inside the web browser. We evaluated properties of the proposed representation and describe its applicability in some common use cases.
Mining Search and Browse Logs for Web Search: A Survey
Cited by 3 (0 self)
Huge amounts of search log data have been accumulated at web search engines. Currently, a popular web search engine may receive billions of queries every day and collect terabytes of records about user search behavior. Besides search log data, huge amounts of browse log data have also been collected through client-side browser plug-ins. Such massive amounts of search and browse log data provide great opportunities for mining the wisdom of crowds and improving web search. At the same time, designing effective and efficient methods to clean, process, and model log data also presents great challenges. In this survey, we focus on mining search and browse log data for web search. We start with an introduction to search and browse log data and an overview of frequently-used data summarizations in log mining. We then elaborate how log mining applications enhance the five major components of a search engine, namely, query understanding, document understanding, document ranking, user understanding, and monitoring & feedbacks. For each aspect, we survey the major tasks, fundamental principles, and state-of-the-art methods.