Results 1–10 of 18
Efficient crowdsourcing of unknown experts using multi-armed bandits
In ECAI, 2012
Cited by 21 (2 self)
Abstract. We address the expert crowdsourcing problem, in which an employer wishes to assign tasks to a set of available workers with heterogeneous working costs. Critically, as workers produce results of varying quality, the utility of each assigned task is unknown and can vary both between workers and individual tasks. Furthermore, in realistic settings, workers are likely to have limits on the number of tasks they can perform and the employer will have a fixed budget to spend on hiring workers. Given these constraints, the objective of the employer is to assign tasks to workers in order to maximise the overall utility achieved. To achieve this, we introduce a novel multi-armed bandit (MAB) model, the bounded MAB, that naturally captures the problem of expert crowdsourcing. We also propose an algorithm to solve it efficiently, called bounded ε-first, which uses the first εB of its total budget B to derive estimates of the workers' quality characteristics (exploration), while the remaining (1 − ε)B is used to maximise the total utility based on those estimates (exploitation). We show that using this technique allows us to derive an …
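The ε-first budget split described in this abstract can be sketched as follows. This is a minimal illustrative sketch, not the paper's algorithm: the worker representation, the Gaussian noise on observed quality, and the greedy quality-per-cost exploitation rule are all assumptions made for the example.

```python
import random

def bounded_epsilon_first(workers, budget, epsilon, seed=0):
    """Sketch of a bounded eps-first policy: spend the first eps*B of the
    budget estimating worker quality (exploration), then spend the rest
    greedily by estimated quality per unit cost (exploitation).
    `workers` maps a name to (cost, task_limit, true_quality); the noise
    model and the greedy exploitation rule are assumptions of this sketch."""
    rng = random.Random(seed)
    spent, utility = 0.0, 0.0
    estimates = {w: [0.0, 0] for w in workers}            # observed quality sum, count
    remaining = {w: limit for w, (_, limit, _) in workers.items()}

    # Exploration: round-robin over workers until eps*B is exhausted.
    explore_budget = epsilon * budget
    progressed = True
    while progressed:
        progressed = False
        for w in sorted(workers):
            cost, _, quality = workers[w]
            if spent + cost <= explore_budget and remaining[w] > 0:
                reward = max(0.0, quality + rng.gauss(0, 0.1))  # noisy observed utility
                estimates[w][0] += reward
                estimates[w][1] += 1
                remaining[w] -= 1
                spent += cost
                utility += reward
                progressed = True

    # Exploitation: rank workers by estimated quality per unit cost and
    # hire them greedily, respecting task limits and the overall budget B.
    def ratio(w):
        s, n = estimates[w]
        return (s / n) / workers[w][0] if n else 0.0

    for w in sorted(workers, key=ratio, reverse=True):
        cost, _, quality = workers[w]
        while remaining[w] > 0 and spent + cost <= budget:
            utility += max(0.0, quality + rng.gauss(0, 0.1))
            remaining[w] -= 1
            spent += cost
    return utility, spent
```

The task limits are what make this a *bounded* MAB: once a good worker's limit is hit, the policy must fall back to the next-best quality-per-cost ratio.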
Truthful Incentives in Crowdsourcing Tasks using Regret Minimization Mechanisms
Cited by 20 (1 self)
What price should be offered to a worker for a task in an online labor market? How can one enable workers to express the amount they desire to receive for the task completion? Designing optimal pricing policies and determining the right monetary incentives is central to maximizing the requester's utility and workers' profits. Yet, current crowdsourcing platforms only offer a limited capability to the requester in designing the pricing policies, and often rules of thumb are used to price tasks. This limitation could result in inefficient use of the requester's budget or workers becoming disinterested in the task. In this paper, we address these questions and present mechanisms using the approach of regret minimization in online learning. We exploit a link between procurement auctions and multi-armed bandits to design mechanisms that are budget feasible, achieve near-optimal utility for the requester, are incentive compatible (truthful) for workers and make minimal assumptions about the distribution of workers' true costs. Our main contribution is a novel, no-regret posted price mechanism, BP-UCB, for budgeted procurement in stochastic online settings. We prove strong theoretical guarantees about our mechanism, and extensively evaluate it in simulations as well as on real data from the Mechanical Turk platform. Compared to the state of the art, our approach leads to a 180% increase in utility.
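To give a rough feel for how posted prices couple truthfulness with bandit learning, here is a generic UCB-over-prices sketch. It is not the paper's BP-UCB mechanism: the price grid, the round cap, and the worker-cost acceptance model are assumptions of this example.

```python
import math
import random

def posted_price_ucb(prices, budget, cost_sampler, max_rounds=10000, seed=0):
    """Sketch of a UCB-over-posted-prices procurement loop (not the
    paper's exact BP-UCB mechanism). Each round, offer the affordable
    price whose upper confidence bound on accepted-tasks-per-dollar is
    highest; a worker with private cost c accepts iff c <= the offer,
    so a take-it-or-leave-it posted price is trivially truthful."""
    rng = random.Random(seed)
    stats = {p: [0, 0] for p in prices}         # accepts, offers
    spent, tasks = 0.0, 0
    for t in range(1, max_rounds + 1):
        affordable = [p for p in prices if spent + p <= budget]
        if not affordable:
            break                               # budget cannot cover any offer

        def ucb(p):
            accepts, offers = stats[p]
            if offers == 0:
                return float("inf")             # force one trial of each price
            mean = accepts / offers
            bonus = math.sqrt(2 * math.log(t) / offers)
            return min(1.0, mean + bonus) / p   # optimistic tasks per dollar

        p = max(affordable, key=ucb)
        stats[p][1] += 1
        if cost_sampler(rng) <= p:              # worker accepts the offer
            stats[p][0] += 1
            spent += p
            tasks += 1
    return tasks, spent
```

Because workers only see a take-it-or-leave-it price, they have no way to gain by misreporting their costs; the learning problem is entirely on the requester's side.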
Automatic Ad Format Selection via Contextual Bandits
Cited by 7 (1 self)
Visual design plays an important role in online display advertising: changing the layout of an online ad can increase or decrease its effectiveness, measured in terms of click-through rate (CTR) or total revenue. The decision of which layout to use for an ad involves a tradeoff: using a layout provides feedback about its effectiveness (exploration), but collecting that feedback requires sacrificing the immediate reward of using a layout we already know is effective (exploitation). To balance exploration with exploitation, we pose automatic layout selection as a contextual-bandit problem. There are many bandit algorithms, each generating a policy which must be evaluated. It is impractical to test each policy on live traffic. However, we have found that offline replay (a.k.a. exploration scavenging) can be adapted to provide an accurate estimator for the performance of ad layout policies at LinkedIn, using only historical data about the effectiveness of layouts. We describe the development of our offline replayer, and benchmark a number of common bandit algorithms.
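The core of an offline-replay estimator is compact: keep only the logged events where the candidate policy agrees with the action that was actually shown, and average the recorded rewards over those matched events. The sketch below assumes the logged layouts were chosen uniformly at random; with a non-uniform logging policy the matched events would need propensity weighting.

```python
def replay_ctr(log, policy):
    """Sketch of the offline-replay (exploration-scavenging) estimator:
    walk a log of (context, shown_layout, clicked) events collected under
    uniform-random exposure, keep only events where the candidate policy
    would have chosen the layout that was actually shown, and average
    the observed clicks over those matched events."""
    matched, clicks = 0, 0
    for context, shown, clicked in log:
        if policy(context) == shown:
            matched += 1
            clicks += clicked
    return clicks / matched if matched else None
```

This is what makes it practical to benchmark many bandit policies without putting any of them on live traffic: each policy is scored against the same historical log.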
Multi-Armed Bandit with Budget Constraint and Variable Costs
Cited by 4 (0 self)
We study the multi-armed bandit problem with budget constraint and variable costs (MAB-BV). In this setting, pulling an arm yields a random reward together with a random cost, and the objective of an algorithm is to pull a sequence of arms so as to maximize the expected total reward while the costs of pulling those arms comply with a budget constraint. This new setting models many Internet applications (e.g., ad exchange, sponsored search, and cloud computing) more accurately than previous settings, where pulling an arm is either costless or incurs a fixed cost. We propose two UCB-based algorithms for the new setting. The first algorithm needs prior knowledge about the lower bound of the expected costs when computing the exploration term. The second algorithm eliminates this need by estimating the minimal expected cost from empirical observations, and can therefore be applied to more real-world applications where prior knowledge is not available. We prove that both algorithms have nice learning abilities, with regret bounds of O(ln B). Furthermore, we show that when applying our proposed algorithms to a previous setting with fixed costs (which can be regarded as a special case of ours), one can improve the previously obtained regret bound. Our simulation results on real-time bidding in ad exchange verify the effectiveness of the algorithms and are consistent with our theoretical analysis.
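One plausible shape for a budget- and cost-aware UCB rule is sketched below. This is an illustrative rule, not either of the paper's two algorithms: the Gaussian reward/cost noise, the cost floor, and the stopping test are assumptions of the example. The idea is to be optimistic about reward, pessimistic about cost, and pull the arm with the best optimistic reward-to-cost ratio.

```python
import math
import random

def budgeted_ucb(arms, budget, seed=0):
    """Sketch of a budget-aware UCB rule for bandits with variable costs
    (illustrative, not the paper's algorithms): pull each arm once, then
    repeatedly pull the arm maximizing (reward UCB) / (cost LCB), stopping
    when the empirical mean cost of the chosen arm would overshoot the
    budget. `arms` maps a name to (mean_reward, mean_cost)."""
    rng = random.Random(seed)
    stats = {a: [0.0, 0.0, 0] for a in arms}     # reward sum, cost sum, pulls
    spent, total, t = 0.0, 0.0, 0

    def pull(a):
        nonlocal spent, total, t
        mean_r, mean_c = arms[a]
        r = max(0.0, mean_r + rng.gauss(0, 0.05))
        c = max(0.01, mean_c + rng.gauss(0, 0.05))  # costs stay positive
        stats[a][0] += r
        stats[a][1] += c
        stats[a][2] += 1
        spent += c
        total += r
        t += 1

    for a in sorted(arms):                       # try every arm once
        pull(a)
    while True:
        def index(a):
            r_sum, c_sum, n = stats[a]
            pad = math.sqrt(2 * math.log(t) / n)
            r_ucb = r_sum / n + pad              # optimistic reward
            c_lcb = max(0.01, c_sum / n - pad)   # pessimistic (low) cost
            return r_ucb / c_lcb

        a = max(arms, key=index)
        if spent + stats[a][1] / stats[a][2] > budget:
            break                                # next pull would likely overshoot
        pull(a)
    return total, spent
```

Note that because the realized cost of the final pull is random, the spend can exceed the budget by roughly one pull's cost; a stricter implementation would reserve headroom for that.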
Bandits with knapsacks: Dynamic procurement for crowdsourcing
In The 3rd Workshop on Social Computing and User Generated Content, co-located with ACM EC, 2013
Cited by 2 (1 self)
Abstract. In a basic version of the dynamic procurement problem, the algorithm has a budget B to spend, and faces n agents (potential sellers) that arrive sequentially. The algorithm offers a take-it-or-leave-it price to each arriving seller; the seller's value for an item is an independent sample from some fixed (but unknown) distribution. The goal is to maximize the number of items bought. This problem is particularly relevant to the emerging domain of crowdsourcing, where agents correspond to the (relatively inexpensive) workers on a crowdsourcing platform such as Amazon Mechanical Turk, and "items" bought/sold correspond to simple jobs ("microtasks") that can be performed by these workers. The algorithm corresponds to the "client": an entity that submits jobs and benefits from them being completed. The basic formulation admits various generalizations, e.g. to multiple job types. We also address an alternative model in which the requester posts offers to the entire crowd. We model the dynamic procurement problems as multi-armed bandit problems with a budget constraint. We define "bandits with knapsacks": a broad class of multi-armed bandit problems with knapsack-style resource-utilization constraints, which subsumes dynamic procurement and a host of other applications. A distinctive feature of our problem, in comparison to the existing regret-minimization literature, is that the optimal policy for a given latent distribution may significantly outperform the policy that plays the optimal fixed arm. Consequently, achieving sublinear regret in the bandits-with-knapsacks problem is significantly more challenging than in conventional bandit problems. Our main result is an algorithm for a version of bandits-with-knapsacks with finitely many possible actions. It is a primal-dual algorithm with multiplicative updates; the regret of this algorithm is close to the information-theoretic optimum. We derive corollaries for dynamic procurement using uniform discretization of prices.
* This is a refocused and shortened version of a paper which is under submission. That paper, titled "Bandits with Knapsacks" …
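The "distinctive feature" this abstract highlights, that the optimal policy can beat the best fixed arm, already shows up in a back-of-the-envelope fluid (expected-value) calculation. The prices and acceptance probabilities below are invented for illustration: with a budget of 100 and 100 sequential agents, offering $2 (always accepted) to 25 agents and $1 (accepted half the time) to the remaining 75 buys 62.5 items in expectation, while either fixed price buys only 50.

```python
def expected_items(plan, budget):
    """Fluid (expected-value) item count for a plan that offers price p
    to k agents in turn, each accepting with probability q; truncates at
    the budget rather than modeling stochastic budget exhaustion.
    `plan` is a list of (price, accept_prob, num_agents) segments."""
    items, spend = 0.0, 0.0
    for price, prob, k in plan:
        buy = prob * k                          # expected accepts in this segment
        cost = buy * price
        if spend + cost > budget:               # out of money mid-segment
            buy = (budget - spend) / price
            cost = budget - spend
        items += buy
        spend += cost
    return items
```

Since no fixed-price policy can match the mixed plan here, sublinear regret against the best *policy* (rather than the best fixed arm) genuinely requires the knapsack-aware machinery the paper develops.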
Research Statement
My research interest is in the design and analysis of algorithms for optimization. I am strongly motivated by applications, particularly in machine learning and the design of electronic markets. As a theoretician, I believe in formulating problems that are fundamental to these applications and yet sufficiently general to be applicable to a wide variety of domains. This has led me to focus on two areas, sequential decision making and discrete nonlinear optimization, introducing broad new problem formulations and solving them by novel algorithmic techniques. We are surrounded by problems where we need to make decisions without having some or all of the relevant information. However, we can learn from the results of our past actions. Examples of such problems are learning the click-through rates of advertisements or learning the effectiveness of drugs during testing. My research focuses on this theme of sequential decision making and its applications to machine learning and algorithmic economics. Discrete optimization is at the center stage of algorithms and has applications to different areas of computer science. A burst of activity in applying it to real-world problems has happened recently because of models which deal with nonlinearity. An example is the sensor placement problem, where the total area covered by the sensors depends on their locations in a nonlinear manner. My research …