Results 1–10 of 29
Intrinsic Robustness of the Price of Anarchy
STOC'09, 2009
Cited by 101 (12 self)

Abstract
The price of anarchy (POA) is a worst-case measure of the inefficiency of selfish behavior, defined as the ratio of the objective function value of a worst Nash equilibrium of a game and that of an optimal outcome. This measure implicitly assumes that players successfully reach some Nash equilibrium. This drawback motivates the search for inefficiency bounds that apply more generally to weaker notions of equilibria, such as mixed Nash and correlated equilibria, or to sequences of outcomes generated by natural experimentation strategies, such as successive best responses or simultaneous regret minimization. We prove a general and fundamental connection between the price of anarchy and its seemingly stronger relatives in classes of games with a sum objective. First, we identify a “canonical sufficient condition” for an upper bound on the POA for pure Nash equilibria, which we call a smoothness argument. Second, we show that every bound derived via a smoothness argument extends automatically, with no quantitative degradation in the bound, to mixed Nash equilibria, correlated equilibria, and the average objective function value of regret-minimizing players (the “price of total anarchy”). Smoothness arguments also have automatic implications for the inefficiency of approximate and Bayesian-Nash equilibria and, under mild additional assumptions, for bicriteria bounds and for polynomial-length best-response sequences. We also identify classes of games — most notably, congestion games with cost functions restricted to an arbitrary fixed set — that are tight, in the sense that smoothness arguments are guaranteed to produce an optimal worst-case upper bound on the POA, even for the smallest set of interest (pure Nash equilibria). Byproducts of our proof of this result include the first tight bounds on the POA in congestion games with non-polynomial cost functions, and the first
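The smoothness condition that this abstract refers to can be stated compactly; the following restates the standard (λ, μ)-smoothness definition and the bound it yields, with C the sum objective and C_i player i's cost:

```latex
% A cost-minimization game with sum objective C(s) = \sum_i C_i(s) is
% (\lambda, \mu)-smooth if, for every pair of outcomes s and s^*,
\sum_{i=1}^{n} C_i(s_i^*, s_{-i}) \;\le\; \lambda\, C(s^*) + \mu\, C(s).
% For \mu < 1, every pure Nash equilibrium s and optimal outcome s^* then satisfy
C(s) \;\le\; \frac{\lambda}{1-\mu}\, C(s^*),
% and, as the abstract states, the same \lambda/(1-\mu) bound extends without
% degradation to mixed Nash, correlated, and regret-minimizing outcomes.
```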
Routing without regret: On convergence to Nash equilibria of regret-minimizing algorithms in routing games
In PODC, 2006
Cited by 58 (7 self)

Abstract
There has been substantial work developing simple, efficient no-regret algorithms for a wide class of repeated decision-making problems including online routing. These are adaptive strategies an individual can use that give strong guarantees on performance even in adversarially changing environments. There has also been substantial work on analyzing properties of Nash equilibria in routing games. In this paper, we consider the question: if each player in a routing game uses a no-regret strategy, will behavior converge to a Nash equilibrium? In general games the answer to this question is known to be no in a strong sense, but routing games have substantially more structure. In this paper we show that in the Wardrop setting of multicommodity flow and infinitesimal agents, behavior will approach Nash equilibrium (formally, on most days, the cost of the flow will be close to the cost of the cheapest paths possible given that flow) at a rate that depends polynomially on the players' regret bounds and the maximum slope of any latency function. We also show that price-of-anarchy results may be applied to these approximate equilibria, and also consider the finite-size (non-infinitesimal) load-balancing model of Azar [2].
Opportunistic Spectrum Access with Multiple Users: Learning under Competition
Cited by 52 (1 self)

Abstract
The problem of cooperative allocation among multiple secondary users to maximize cognitive system throughput is considered. The channel availability statistics are initially unknown to the secondary users and are learnt via sensing samples. Two distributed learning and allocation schemes which maximize the cognitive system throughput, or equivalently minimize the total regret in distributed learning and allocation, are proposed. The first scheme assumes minimal prior information in terms of pre-allocated ranks for secondary users, while the second scheme is fully distributed and assumes no such prior information. The two schemes have sum regret which is provably logarithmic in the number of sensing time slots. A lower bound, asymptotically logarithmic in the number of slots, is derived for any learning scheme. Hence, our schemes achieve asymptotic order optimality in terms of regret in distributed learning and allocation.
Distributed Algorithms for Learning and Cognitive Medium Access with Logarithmic Regret
Cited by 40 (0 self)

Abstract
The problem of distributed learning and channel access is considered in a cognitive network with multiple secondary users. The availability statistics of the channels are initially unknown to the secondary users and are estimated using sensing decisions. There is no explicit information exchange or prior agreement among the secondary users. We propose policies for distributed learning and access which achieve order-optimal cognitive system throughput (number of successful secondary transmissions) under self play, i.e., when implemented at all the secondary users. Equivalently, our policies minimize the regret in distributed learning and access. We first consider the scenario when the number of secondary users is known to the policy, and prove that the total regret is logarithmic in the number of transmission slots. Our distributed learning and access policy achieves order-optimal regret, by comparison to an asymptotic lower bound on regret under any uniformly good learning and access policy. We then consider the case when the number of secondary users is fixed but unknown, and is estimated through feedback. We propose a policy in this scenario whose asymptotic sum regret grows slightly faster than logarithmically in the number of transmission slots.
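The single-user core of such index-based policies can be sketched in a few lines. Below is a minimal UCB1-style sketch; the channel means and horizon are illustrative, and the paper's actual policies additionally handle multiple users and collisions, which this sketch omits:

```python
import math
import random

def ucb1(channel_means, horizon, seed=0):
    """Minimal single-user UCB1 sketch: each slot, sense the channel with
    the highest sample-mean-plus-confidence index (illustrative only)."""
    rng = random.Random(seed)
    k = len(channel_means)
    counts = [0] * k      # times each channel has been sensed
    means = [0.0] * k     # empirical availability of each channel
    for t in range(1, horizon + 1):
        if t <= k:
            arm = t - 1   # initialization: sense each channel once
        else:             # UCB1 index: mean + sqrt(2 ln t / n_arm)
            arm = max(range(k),
                      key=lambda a: means[a] + math.sqrt(2.0 * math.log(t) / counts[a]))
        # Bernoulli "channel free" observation drawn from the (unknown) true mean
        r = 1.0 if rng.random() < channel_means[arm] else 0.0
        counts[arm] += 1
        means[arm] += (r - means[arm]) / counts[arm]
    return counts

counts = ucb1([0.2, 0.5, 0.8], horizon=5000)
print(counts)  # sensing counts concentrate on the mean-0.8 channel
```

The logarithmic-regret guarantee corresponds to the suboptimal channels being sensed only O(log t) times each.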
Circumventing the Price of Anarchy: Leading Dynamics to Good Behavior
Cited by 14 (6 self)

Abstract
Many natural games can have a dramatic difference between the quality of their best and worst Nash equilibria, even in pure strategies. Yet, nearly all work to date on dynamics shows only convergence to some equilibrium, especially within a polynomial number of steps. In this work we study how agents with some knowledge of the game might be able to quickly (within a polynomial number of steps) find their way to states of quality close to the best equilibrium. We consider two natural learning models in which players choose between greedy behavior and following a proposed good but untrusted strategy, and analyze two important classes of games in this context: fair cost-sharing and consensus games. Both games have an extremely high price of anarchy, and yet we show that behavior in these models can efficiently reach low-cost states.
On the Inefficiency Ratio of Stable Equilibria in Congestion Games
Cited by 13 (0 self)

Abstract
Price of anarchy and price of stability are the primary notions for measuring the efficiency (i.e., the social welfare) of the outcome of a game. Both of these notions focus on extreme cases: one is defined as the inefficiency ratio of the worst-case equilibrium and the other as that of the best one. Therefore, studying these notions often results in discovering equilibria that are not necessarily the most likely outcomes of the dynamics of selfish and non-coordinating agents. The current paper studies the inefficiency of the equilibria that are most stable in the presence of noise. In particular, we study two variations of non-cooperative games: atomic congestion games and selfish load balancing. The noisy best-response dynamics in these games keeps the joint action profile around a particular set of equilibria that minimize the potential function. The inefficiency ratio in the neighborhood of these “stable” equilibria is much better than the price of anarchy. Furthermore, the dynamics reaches these equilibria in polynomial time. Our observations show that in game environments where a small noise is present, the system as a whole works better than what a pessimist may predict. They also suggest that in congestion games, introducing a small noise in the payoff of the agents may improve the social welfare.
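The flavor of noisy best-response dynamics can be illustrated on a toy two-player, two-link congestion game; the latency coefficients and temperature below are hypothetical, and this is a sketch of logit dynamics rather than the paper's exact model. The split profiles minimize Rosenthal's potential (2.5 versus 3.0 and 4.5), and the dynamics concentrates on them:

```python
import math
import random

def logit_dynamics(a, steps=20000, tau=0.1, seed=1):
    """Noisy best-response (logit) dynamics in a 2-player, 2-link congestion
    game with linear latencies l_e(x) = a[e] * x. Each step, one random player
    re-samples a link with probability proportional to exp(-cost / tau)."""
    rng = random.Random(seed)
    s = [0, 0]                       # both players start on link 0
    visits = {}                      # time spent in each joint profile
    for _ in range(steps):
        i = rng.randrange(2)
        other = s[1 - i]
        # cost of choosing link e: latency coefficient times resulting load
        costs = [a[e] * (1 + (other == e)) for e in (0, 1)]
        w = [math.exp(-c / tau) for c in costs]
        s[i] = 0 if rng.random() < w[0] / (w[0] + w[1]) else 1
        key = tuple(s)
        visits[key] = visits.get(key, 0) + 1
    return visits

visits = logit_dynamics(a=[1.0, 1.5])
split = visits.get((0, 1), 0) + visits.get((1, 0), 0)
print(split / sum(visits.values()))  # most time is spent in the potential-minimizing split profiles
```

At low temperature tau, the stationary distribution of logit dynamics is proportional to exp(-potential/tau), which is why the potential-minimizing ("stochastically stable") profiles dominate.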
Beyond the Nash equilibrium barrier
In Proc. of ICS, 2011
Cited by 9 (8 self)

Abstract
Nash equilibrium analysis has become the de facto standard for judging the solution quality achieved in systems composed of selfish users. This mindset is so pervasive in computer science that even the few papers devoted to directly analyzing outcomes of dynamic processes in repeated games (e.g., best-response or no-regret learning dynamics) have focused on showing that the performance of these dynamics is comparable to that of Nash equilibria. By assuming that equilibria are representative of the outcomes of selfish behavior, do we ever reach qualitatively wrong conclusions about those outcomes? In this paper, we argue that there exist games whose equilibria represent unnatural outcomes that are hard to coordinate on, and that the solution quality achieved by selfish users in such games is more accurately reflected in the disequilibrium represented by dynamics such as those produced by natural families of online learning algorithms. We substantiate this viewpoint by studying a game with a unique Nash equilibrium, but where natural learning dynamics exhibit non-convergent cycling behavior rather than converging to this equilibrium. We show that the outcome of this learning process is optimal and has much better social welfare than the unique Nash equilibrium, dramatically illustrating that natural learning processes have the potential to significantly outperform equilibrium-based analysis.
Load Balancing Without Regret in the Bulletin Board Model
Cited by 8 (4 self)

Abstract
We analyze the performance of protocols for load balancing in distributed systems based on no-regret algorithms from online learning theory. These protocols treat load balancing as a repeated game and apply algorithms whose average performance over time is guaranteed to match or exceed the average performance of the best strategy in hindsight. Our approach captures two major aspects of distributed systems. First, in our setting of atomic load balancing, every single process can have a significant impact on the performance and behavior of the system. Furthermore, although in distributed systems participants can query the current state of the system, they cannot reliably predict the effect of their actions on it. We address this issue by considering load balancing games in the bulletin board model, where players can find out the delay on all machines, but do not have information on what their experienced delay would have been if they had selected another machine. We show that under these more realistic assumptions, if all players use the well-known multiplicative weights algorithm, then the quality of the resulting solution is exponentially better than that of the worst correlated equilibrium, and almost as good as that of the worst Nash equilibrium. These tighter bounds are derived from analyzing the dynamics of a multi-agent learning system.
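The full-information feedback of the bulletin board model is exactly what makes multiplicative-weights updates applicable: each round a player sees the posted delay of every machine, not only the one it chose. A minimal single-player Hedge sketch, with a hypothetical delay sequence and learning rate:

```python
import math

def hedge(loss_rows, eta=0.1):
    """Hedge / multiplicative weights with full-information (bulletin-board)
    feedback: after each round the player sees the loss of *every* action
    and multiplicatively down-weights the costly ones."""
    k = len(loss_rows[0])
    weights = [1.0] * k
    expected_loss = 0.0
    for losses in loss_rows:
        total = sum(weights)
        probs = [w / total for w in weights]
        expected_loss += sum(p * l for p, l in zip(probs, losses))
        weights = [w * math.exp(-eta * l) for w, l in zip(weights, losses)]
    best_fixed = min(sum(col) for col in zip(*loss_rows))  # best machine in hindsight
    return expected_loss, best_fixed

# hypothetical posted delays for 3 machines: machine 0 is usually, but not always, best
rows = [[1.0, 0.2, 0.5] if t % 5 == 0 else [0.1, 0.9, 0.5] for t in range(1000)]
expected_loss, best_fixed = hedge(rows)
print(expected_loss - best_fixed)  # total regret stays bounded (roughly ln(k)/eta + eta*T/8)
```

The no-regret guarantee is exactly the "match or exceed the best strategy in hindsight" property the abstract describes; the multi-agent analysis in the paper studies what happens when every player runs such an update simultaneously.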
Performance and Convergence of Multiuser Online Learning
In Proc. of International Conference on Game Theory for Networks (GAMNETS), 2011
Cited by 5 (2 self)

Abstract
We study the problem of allocating multiple users to a set of wireless channels in a decentralized manner when the channel qualities are time-varying and unknown to the users, and accessing the same channel by multiple users leads to reduced quality (e.g., data rates) received by the users due to interference. In such a setting the users need to learn not only the inherent channel quality but at the same time the best allocation of users to channels, so as to maximize the social welfare. Assuming that the users adopt a certain online learning algorithm, we investigate under what conditions the socially optimal allocation is achievable. In particular we examine the effect of different levels of knowledge the users may have and the amount of communication and cooperation among them. The general conclusion is that as the cooperation of users decreases and the uncertainty about channel payoffs increases, it becomes harder to achieve the socially optimal allocation. Specifically, we consider three cases. In the first case, channel rates are generated by an i.i.d. process. The users do not know this process or the interference function, and there is no information exchange among users. We show that by using a randomized learning algorithm users converge to the pure Nash equilibria of an equivalent congestion game. In the second case, a user is assumed to know the total number of users and the number of users on the channel it is using. We show that a sample-mean based algorithm can achieve the socially optimal allocation with a sublinear regret in time. In the third case, we show that if the channel rates are constant but unknown, and a user knows the total number of users, then the socially optimal allocation is achieved in finite time with a randomized learning algorithm.
A theoretical examination of practical game playing: Lookahead search
In SAGT, M. Serna, Ed., Lecture Notes in Computer Science, 2012
Cited by 3 (1 self)

Abstract
Lookahead search is perhaps the most natural and widely used game playing strategy. Given the practical importance of the method, the aim of this paper is to provide a theoretical performance examination of lookahead search in a wide variety of applications. To determine a play using lookahead search, each agent predicts multiple levels of possible reactions to her move (via the use of a search tree), and then chooses the play that optimizes her future payoff accounting for these reactions. There are several optimization functions the agents can choose from, where the most appropriate choice will depend on the specifics of the actual game; we illustrate this in our examples. Furthermore, the type of search tree chosen by a computationally constrained agent can vary. We focus on the case where agents can evaluate only a bounded number, k, of moves into the future. That is, we use depth-k search trees and call this approach k-lookahead search. We apply our method in five well-known settings: industrial organization (Cournot's model); AdWord auctions; congestion games; valid-utility games and basic-utility games; and cost-sharing network design games. We consider two questions. First, what is
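The recursion behind k-lookahead search can be sketched generically. The toy game below is hypothetical and only illustrates that the chosen play can change with the depth k: a depth-1 agent trusts the heuristic value of an internal state, while a depth-2 agent foresees the opponent's best response:

```python
def lookahead_value(state, depth, mover, moves, apply_move, payoff):
    """Depth-limited lookahead for a two-player alternating-move game:
    the mover assumes every future mover best-responds for itself, and
    leaves (or the depth cutoff) are scored by the payoff heuristic."""
    if depth == 0 or not moves(state, mover):
        return (payoff(state, 0), payoff(state, 1))
    best = None
    for m in moves(state, mover):
        child = apply_move(state, mover, m)
        vals = lookahead_value(child, depth - 1, 1 - mover,
                               moves, apply_move, payoff)
        if best is None or vals[mover] > best[mover]:
            best = vals
    return best

def lookahead_move(state, depth, mover, moves, apply_move, payoff):
    """The play a k-lookahead agent actually selects at `state`."""
    return max(moves(state, mover),
               key=lambda m: lookahead_value(apply_move(state, mover, m),
                                             depth - 1, 1 - mover,
                                             moves, apply_move, payoff)[mover])

# Hypothetical toy game: player 0 moves at "root", player 1 replies at "B".
MOVES = {"root": {"a": "A", "b": "B"}, "B": {"c": "C", "d": "D"}}
PAYOFF = {"A": (2, 0), "B": (3, 1), "C": (3, 1), "D": (0, 5)}

moves = lambda state, mover: sorted(MOVES.get(state, {}))
apply_move = lambda state, mover, m: MOVES[state][m]
payoff = lambda state, player: PAYOFF[state][player]

print(lookahead_move("root", 1, 0, moves, apply_move, payoff))  # 'b': trusts B's heuristic value (3, 1)
print(lookahead_move("root", 2, 0, moves, apply_move, payoff))  # 'a': foresees player 1 replying 'd'
```

Swapping the payoff indexing in the recursion for a different optimization function (e.g., pessimistic rather than best-response predictions) corresponds to the alternative lookahead criteria the paper discusses.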