Results 1 - 10
of
43
Optimizing relevance and revenue in ad search: A query substitution approach
- In SIGIR’08
, 2008
"... The primary business model behind Web search is based on textual advertising, where contextually relevant ads are displayed alongside search results. We address the problem of selecting these ads so that they are both relevant to the queries and profitable to the search engine, showing that optimizi ..."
Abstract
-
Cited by 45 (14 self)
- Add to MetaCart
(Show Context)
The primary business model behind Web search is based on textual advertising, where contextually relevant ads are displayed alongside search results. We address the problem of selecting these ads so that they are both relevant to the queries and profitable to the search engine, showing that optimizing ad relevance and revenue is not equivalent. Selecting the best ads that satisfy these constraints also naturally incurs high computational costs, and time constraints can lead to reduced relevance and profitability. We propose a novel two-stage approach, which conducts most of the analysis ahead of time. An offline preprocessing phase leverages additional knowledge that is impractical to use in real time, and rewrites frequent queries in a way that subsequently facilitates fast and accurate online matching. Empirical evaluation shows that our method optimized for relevance matches a state-of-the-art method while improving expected revenue. When optimizing for revenue, we see even more substantial improvements in expected revenue.
Online expansion of rare queries for sponsored search
- In Proceedings of the 18th International World Wide Web Conference
, 2009
"... Sponsored search systems are tasked with matching queries to relevant advertisements. The current state-of-the-art matching algorithms expand the user’s query using a variety of external resources, such as Web search results. While these expansion-based algorithms are highly effective, they are larg ..."
Abstract
-
Cited by 42 (11 self)
- Add to MetaCart
(Show Context)
Sponsored search systems are tasked with matching queries to relevant advertisements. The current state-of-the-art matching algorithms expand the user’s query using a variety of external resources, such as Web search results. While these expansion-based algorithms are highly effective, they are largely inefficient and cannot be applied in real-time. In practice, such algorithms are applied offline to popular queries, with the results of the expensive operations cached for fast access at query time. In this paper, we describe an efficient and effective approach for matching ads against rare queries that were not processed offline. The approach builds an expanded query representation by leveraging offline processing done for related popular queries. Our experimental results show that our approach significantly improves the effectiveness of advertising on rare queries with only a negligible increase in computational cost.
Predicting bounce rates in sponsored search advertisements
- In SIGKDD Conference on Knowledge Discovery and Data Mining (KDD
, 2009
"... This paper explores an important and relatively unstudied quality measure of a sponsored search advertisement: bounce rate. The bounce rate of an ad can be informally defined as the fraction of users who click on the ad but almost immediately move on to other tasks. A high bounce rate can lead to po ..."
Abstract
-
Cited by 32 (3 self)
- Add to MetaCart
(Show Context)
This paper explores an important and relatively unstudied quality measure of a sponsored search advertisement: bounce rate. The bounce rate of an ad can be informally defined as the fraction of users who click on the ad but almost immediately move on to other tasks. A high bounce rate can lead to poor advertiser return on investment, and suggests search engine users may be having a poor experience following the click. In this paper, we first provide quantitative analysis showing that bounce rate is an effective measure of user satisfaction. We then address the question, can we predict bounce rate by analyzing the features of the advertisement? An affirmative answer would allow advertisers and search engines to predict the effectiveness and quality of advertisements before they are shown. We propose solutions to this problem involving large-scale learning methods that leverage features drawn from ad creatives in addition
Improving Ad Relevance in Sponsored Search
"... We describe a machine learning approach for predicting sponsored search ad relevance. Our baseline model incorporates basic features of text overlap and we then extend the model to learn from past user clicks on advertisements. We present a novel approach using translation models to learn user click ..."
Abstract
-
Cited by 25 (4 self)
- Add to MetaCart
We describe a machine learning approach for predicting sponsored search ad relevance. Our baseline model incorporates basic features of text overlap and we then extend the model to learn from past user clicks on advertisements. We present a novel approach using translation models to learn user click propensity from sparse click logs. Our relevance predictions are then applied to multiple sponsored search applications in both offline editorial evaluations and live online user tests. The predicted relevance score is used to improve the quality of the search page in three areas: filtering low quality ads, more accurate ranking for ads, and optimized page placement of ads to reduce prominent placement of low relevance ads. We show significant gains across all three tasks.
Context-sensitive query auto-completion
- In WWW ’11
, 2011
"... ABSTRACT Query auto completion is known to provide poor predictions of the user's query when her input prefix is very short (e.g., one or two characters). In this paper we show that context, such as the user's recent queries, can be used to improve the prediction quality considerably even ..."
Abstract
-
Cited by 20 (0 self)
- Add to MetaCart
(Show Context)
ABSTRACT Query auto completion is known to provide poor predictions of the user's query when her input prefix is very short (e.g., one or two characters). In this paper we show that context, such as the user's recent queries, can be used to improve the prediction quality considerably even for such short prefixes. We propose a context-sensitive query auto completion algorithm, NearestCompletion, which outputs the completions of the user's input that are most similar to the context queries. To measure similarity, we represent queries and contexts as high-dimensional term-weighted vectors and resort to cosine similarity. The mapping from queries to vectors is done through a new query expansion technique that we introduce, which expands a query by traversing the query recommendation tree rooted at the query. In order to evaluate our approach, we performed extensive experimentation over the public AOL query log. We demonstrate that when the recent user's queries are relevant to the current query she is typing, then after typing a single character, NearestCompletion's MRR is 48% higher relative to the MRR of the standard MostPopularCompletion algorithm on average. When the context is irrelevant, however, NearestCompletion's MRR is essentially zero. To mitigate this problem, we propose HybridCompletion, which is a hybrid of NearestCompletion with MostPopularCompletion. HybridCompletion is shown to dominate both NearestCompletion and MostPopularCompletion, achieving a total improvement of 31.5% in MRR relative to MostPopularCompletion on average.
Learning Discriminative Projections for Text Similarity Measures
"... Traditional text similarity measures consider each term similar only to itself and do not model semantic relatedness of terms. We propose a novel discriminative training method that projects the raw term vectors into a common, low-dimensional vector space. Our approach operates by finding the optima ..."
Abstract
-
Cited by 19 (5 self)
- Add to MetaCart
(Show Context)
Traditional text similarity measures consider each term similar only to itself and do not model semantic relatedness of terms. We propose a novel discriminative training method that projects the raw term vectors into a common, low-dimensional vector space. Our approach operates by finding the optimal matrix to minimize the loss of the pre-selected similarity function (e.g., cosine) of the projected vectors, and is able to efficiently handle a large number of training examples in the highdimensional space. Evaluated on two very different tasks, cross-lingual document retrieval and ad relevance measure, our method not only outperforms existing state-of-the-art approaches, but also achieves high accuracy at low dimensions and is thus more efficient. 1
Using landing pages for sponsored search ad selection
- In WWW ’10: Proceedings of the 19th international conference on World wide web
, 2010
"... We explore the use of the landing page content in sponsored search ad selection. Specifically, we compare the use of the ad’s intrinsic content to augmenting the ad with the whole, or parts, of the landing page. We explore two types of extractive summarization techniques to select useful regions fro ..."
Abstract
-
Cited by 17 (1 self)
- Add to MetaCart
(Show Context)
We explore the use of the landing page content in sponsored search ad selection. Specifically, we compare the use of the ad’s intrinsic content to augmenting the ad with the whole, or parts, of the landing page. We explore two types of extractive summarization techniques to select useful regions from the landing pages: out-of-context and in-context methods. Out-of-context methods select salient regions from the landing page by analyzing the content alone, without taking into account the ad associated with the landing page. In-context methods use the ad context (including its title, creative, and bid phrases) to help identify regions of the landing page that should be used by the ad selection engine. In addition, we introduce a simple yet effective unsupervised algorithm to enrich the ad context to further improve the ad selection. Experimental evaluation confirms that the use of landing pages can significantly improve the quality of ad selection. We also find that our extractive summarization techniques reduce the size of landing pages substantially, while retaining or even improving the performance of ad retrieval over the method that utilize the entire landing page.
Automatic Generation of Bid Phrases for Online Advertising
"... One of the most prevalent online advertising methods is textual advertising. To produce a textual ad, an advertiser must craft a short creative (the text of the ad) linking to a landing page, which describes the product or service being promoted. Furthermore, the advertiser must associate the creati ..."
Abstract
-
Cited by 15 (0 self)
- Add to MetaCart
(Show Context)
One of the most prevalent online advertising methods is textual advertising. To produce a textual ad, an advertiser must craft a short creative (the text of the ad) linking to a landing page, which describes the product or service being promoted. Furthermore, the advertiser must associate the creative to a set of manually chosen bid phrases representing those Web search queries that should trigger the ad. For efficiency, given a landing page, the bid phrases are often chosen first, and then for each bid phrase the creative is produced using a template. Nevertheless, an ad campaign (e.g., for a large retailer) might involve thousands of landing pages and tens or hundreds of thousands of bid phrases, hence the entire process is very laborious. Our study aims towards the automatic construction of online ad campaigns: given a landing page, we propose several algorithmic methods to generate bid phrases suitable for the given input. Such phrases must be both relevant (that is, reflect the content of the page) and well-formed (that is, likely to be used as queries to a Web search engine). To this end, we use a two phase approach. First, candidate bid phrases are generated by a number of methods, including a (monolingual) translation model capable of generating phrases not contained within the text of the input as well as previously “unseen ” phrases. Second, the candidates are ranked in a probabilistic framework using both the translation model, which favors relevant phrases, as well as a bid phrase language model, which favors well-formed phrases. Empirical evaluation based on a real-life corpus of advertisercreated landing pages and associated bid phrases confirms the value of our approach, which successfully re-generates many of the human-crafted bid phrases and performs significantly better than a pure text extraction method. The research described herein was conducted while the first author was a summer intern at Yahoo! Research.
Evaluation Strategies for Top-k Queries over Memory-Resident Inverted Indexes
"... Top-k retrieval over main-memory inverted indexes is at the core of many modern applications: from large scale web search and advertising platforms, to text extenders and content management systems. In these systems, queries are evaluated using two major families of algorithms: documentat-a-time (DA ..."
Abstract
-
Cited by 13 (2 self)
- Add to MetaCart
(Show Context)
Top-k retrieval over main-memory inverted indexes is at the core of many modern applications: from large scale web search and advertising platforms, to text extenders and content management systems. In these systems, queries are evaluated using two major families of algorithms: documentat-a-time (DAAT) and term-at-a-time (TAAT). DAAT and TAAT algorithms have been studied extensively in the research literature, but mostly in disk-based settings. In this paper, we present an analysis and comparison of several DAAT and TAAT algorithms used in Yahoo!’s production platform for online advertising. The low-latency requirements of online advertising systems mandate memory-resident indexes. We compare the performance of several query evaluation algorithms using two real-world ad selection datasets and query workloads. We show how some adaptations of the original algorithms for main memory setting have yielded significant performance improvement, reducing running time and cost of serving by 60 % in some cases. In these results both the original and the adapted algorithms have been evaluated over memory-resident indexes, so the improvements are algorithmic and not due to the fact that the experiments used main memory indexes. 1.
Classifying Search Queries Using the Web as a Source of Knowledge ∗
"... We propose a methodology for building a robust query classification system that can identify thousands of query classes, while dealing in real-time with the query volume of a commercial Web search engine. We use a pseudo relevance feedback technique: given a query, we determine its topic by classify ..."
Abstract
-
Cited by 12 (0 self)
- Add to MetaCart
(Show Context)
We propose a methodology for building a robust query classification system that can identify thousands of query classes, while dealing in real-time with the query volume of a commercial Web search engine. We use a pseudo relevance feedback technique: given a query, we determine its topic by classifying the Web search results retrieved by the query. Motivated by the needs of search advertising, we primarily focus on rare queries, which are the hardest from the point of view of machine learning, yet in aggregation account for a considerable fraction of search engine traffic. Empirical evaluation confirms that our methodology yields a considerably higher classification accuracy than previously reported. We believe that the proposed methodology will lead to better matching of online ads to rare queries and overall to a better user experience.