Results 1 - 10 of 12
AWESOME: A General Multiagent Learning Algorithm that Converges in Self-Play and Learns a Best Response against Stationary Opponents
- In Proceedings of the 20th International Conference on Machine Learning
, 2006
Cited by 57 (5 self)
Two minimal requirements for a satisfactory multiagent learning algorithm are that it (1) learns to play optimally against stationary opponents and (2) converges to a Nash equilibrium in self-play. The previous algorithm that has come closest, WoLF-IGA, has been proven to have these two properties in 2-player 2-action (repeated) games, assuming that the opponent's mixed strategy is observable. Another algorithm, ReDVaLeR (which was introduced after the algorithm described in this paper), achieves the two properties in games with arbitrary numbers of actions and players, but still requires that the opponents' mixed strategies are observable. In this paper we present AWESOME, the first algorithm that is guaranteed to have the two properties in games with arbitrary numbers of actions and players. It is still the only algorithm that does so while only relying on observing the other players' actual actions (not their mixed strategies). It also learns to play optimally against opponents that eventually become stationary. The basic idea behind AWESOME (Adapt When Everybody is Stationary, Otherwise Move to Equilibrium) is to try to adapt to the others' strategies when they appear stationary, but otherwise to retreat to a precomputed equilibrium strategy. We provide experimental results that suggest that AWESOME converges fast in practice. The techniques used to prove the properties of AWESOME are fundamentally different from those used for previous algorithms, and may help in analyzing future multiagent learning algorithms as well.
On the Impossibility of Predicting the Behavior of Rational Agents
by Dean P. Foster
- Proceedings of the National Academy of Sciences of the USA
, 2001
Cited by 17 (3 self)
A foundational assumption in economics is that people are rational -- they choose optimal plans of action given their predictions about future states of the world. In games of strategy this means that each player's strategy should be optimal given his or her prediction of the opponents' strategies. We demonstrate that there is an inherent tension between rationality and prediction when players are uncertain about their opponents' payoff functions. Specifically, there are games in which it is impossible for perfectly rational players to learn to predict the future behavior of their opponents (even approximately) no matter what learning rule they use. The reason is that, in trying to predict the next-period behavior of an opponent, a rational player must take an action this period that the opponent can observe. This observation may cause the opponent to alter his next-period behavior, thus invalidating the first player's prediction. The resulting feedback loop has the property that, in almost every time period, someone predicts that his opponent has a non-negligible probability of choosing one action, when in fact the opponent is certain to choose a different action. We conclude that there are strategic situations where it is impossible in principle for perfectly rational agents to learn to predict the future behavior of other perfectly rational agents, based solely on their observed actions.
Rationality vs predictability
Economists often assume that people are rational: they maximize their expected payoffs given their beliefs about future states of the world. This hypothesis plays a crucial role in game theory, where each player is assumed to choose an optimal strategy given his belief about the strategies of his opponents. In this setting, a belief amounts to a forec...
A Learning Approach to Auctions
, 1998
Cited by 14 (3 self)
We analyze a repeated first-price auction in which the types of the players are determined before the first round. It is proved that if every player uses either a belief-based learning scheme with bounded recall or a generalized fictitious play learning scheme, then after a sufficiently long time, the players' bids are in equilibrium in the one-shot auction in which the types are commonly known.
On the convergence of fictitious play
- Mathematics of Operations Research
, 1998
Cited by 11 (0 self)
We study the continuous time Brown-Robinson fictitious play process for non-zero sum games. We show that, in general, fictitious play cannot converge cyclically to a mixed strategy equilibrium in which both players use more than two pure strategies.
Generalised weakened fictitious play
, 2004
Cited by 8 (2 self)
A general class of adaptive processes in games is developed, which significantly generalises weakened fictitious play [Van der Genugten, B., 2000. A weakened form of fictitious play in two-person zero-sum games. Int. Game Theory Rev. 2, 307–328] and includes several interesting fictitious-play-like processes as special cases. The general model is rigorously analysed using the best response differential inclusion, and shown to converge in games with the fictitious play property. Furthermore, a new actor–critic process is introduced, in which the only information given to a player is the reward received as a result of selecting an action—a player need not even know they are playing a game. It is shown that this results in a generalised weakened fictitious play process, and can therefore be considered as a first step towards explaining how players might learn to play Nash equilibrium strategies without having any knowledge of the game, or even that they are playing a game.
Approximation guarantees for fictitious play
- In Proceedings of the 47th Annual Allerton Conference on Communication, Control, and Computing
, 2009
Cited by 3 (0 self)
Fictitious play is a simple, well-known, and often-used algorithm for playing (and, especially, learning to play) games. However, in general it does not converge to equilibrium; even when it does, we may not be able to run it to convergence. Still, we may obtain an approximate equilibrium. In this paper, we study the approximation properties that fictitious play obtains when it is run for a limited number of rounds. We show that if both players randomize uniformly over their actions in the first r rounds of fictitious play, then the result is an ε-equilibrium, where ε = (r + 1)/(2r). (Since we are examining only a constant number of pure strategies, we know that ε < 1/2 is impossible, due to a result of Feder et al.) We show that this bound is tight in the worst case; however, with an experiment on random games, we illustrate that fictitious play usually obtains a much better approximation. We then consider the possibility that the players fail to choose the same r. We show how to obtain the optimal approximation guarantee when both the opponent’s r and the game are adversarially chosen (but there is an upper bound R on the opponent’s r), using a linear program formulation. We show that if the action played in the ith round of fictitious play is chosen with probability proportional to 1 for i = 1 and 1/(i − 1) for all 2 ≤ i ≤ R + 1, this gives an approximation guarantee of 1 − 1/(2 + ln R). We also obtain a lower bound of 1 − 4/ln R. This provides an actionable prescription for how long to run fictitious play.
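The closed-form quantities quoted in this abstract are easy to tabulate. The following Python sketch only evaluates the formulas as stated above; the function names are hypothetical and are not taken from the paper, and nothing here reproduces the paper's proofs or experiments.

```python
import math


def epsilon_uniform(r):
    """Guarantee (r + 1) / (2r) quoted for uniform mixing over the first r rounds."""
    if r < 1:
        raise ValueError("r must be a positive integer")
    return (r + 1) / (2 * r)


def round_distribution(R):
    """Probabilities over rounds i = 1..R+1, proportional to 1 for i = 1
    and to 1/(i - 1) for 2 <= i <= R + 1, normalized to sum to 1."""
    if R < 1:
        raise ValueError("R must be a positive integer")
    weights = [1.0] + [1.0 / (i - 1) for i in range(2, R + 2)]
    total = sum(weights)
    return [w / total for w in weights]


def guarantee_unknown_r(R):
    """Guarantee 1 - 1/(2 + ln R) quoted for the weighting in round_distribution."""
    if R < 1:
        raise ValueError("R must be a positive integer")
    return 1 - 1 / (2 + math.log(R))
```

As r grows, epsilon_uniform(r) decreases toward 1/2, which matches the remark that ε < 1/2 is not attainable in this setting.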
On the Rate of Convergence of Fictitious Play
, 2010
Fictitious play is a simple learning algorithm for strategic games that proceeds in rounds. In each round, the players play a best response to a mixed strategy that is given by the empirical frequencies of actions played in previous rounds. There is a close relationship between fictitious play and the Nash equilibria of a game: if the empirical frequencies of fictitious play converge to a strategy profile, this strategy profile is a Nash equilibrium. While fictitious play does not converge in general, it is known to do so for certain restricted classes of games, such as constant-sum games, non-degenerate 2 × n games, and potential games. We study the rate of convergence of fictitious play and show that, in all the classes of games mentioned above, fictitious play may require an exponential number of rounds (in the size of the representation of the game) before some equilibrium action is eventually played. In particular, we show the above statement for symmetric constant-sum win-lose-tie games.
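The round structure described in this abstract can be illustrated with a minimal Python sketch for two-player games. It is an illustration only, not the construction analyzed in the paper: the function names, the initial count of one per action (a stand-in for an arbitrary first move), and the lowest-index tie-breaking rule are all assumptions.

```python
def best_response(payoff, opponent_counts):
    """Index of the own action with the highest payoff against the opponent's
    empirical counts. Counts are not normalized because scaling does not
    change the argmax. Ties go to the lowest index (an assumption)."""
    def score(action):
        return sum(p * c for p, c in zip(payoff[action], opponent_counts))
    return max(range(len(payoff)), key=score)


def fictitious_play(row_payoff, col_payoff, rounds):
    """Run simultaneous fictitious play and return both empirical frequencies.

    row_payoff[i][j] and col_payoff[i][j] are the payoffs to the row and
    column player when row plays i and column plays j."""
    n_rows, n_cols = len(row_payoff), len(row_payoff[0])
    # Hypothetical initialization: one pseudo-observation per action.
    row_counts = [1] * n_rows
    col_counts = [1] * n_cols
    # Transpose so the column player's payoffs are indexed by its own action.
    col_payoff_t = [[col_payoff[i][j] for i in range(n_rows)] for j in range(n_cols)]
    for _ in range(rounds):
        a = best_response(row_payoff, col_counts)
        b = best_response(col_payoff_t, row_counts)
        row_counts[a] += 1
        col_counts[b] += 1
    row_total, col_total = sum(row_counts), sum(col_counts)
    return ([c / row_total for c in row_counts],
            [c / col_total for c in col_counts])
```

For example, on matching pennies (row payoffs [[1, -1], [-1, 1]], column payoffs negated), the empirical frequencies drift toward the uniform mixed equilibrium, consistent with the convergence result for constant-sum games mentioned above.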
Continuous fictitious play via projective geometry
, 2002
A Learning Perspective on Selfish Behavior in Games
, 2009
Computer systems increasingly involve the interaction of multiple self-interested agents. The designers of these systems have objectives they wish to optimize, but by allowing selfish agents to interact in the system, they lose the ability to directly control behavior. What is lost by this lack of centralized control? What are the likely outcomes of selfish behavior? In this work, we consider learning dynamics as a tool for better classifying and understanding outcomes of selfish behavior in games. In particular, when such learning algorithms exist and are efficient, we propose “regret-minimization” as a criterion for self-interested behavior and study the system-wide effects in broad classes of games when players achieve this criterion. In addition, we present a general transformation from offline approximation algorithms for linear optimization problems to online algorithms that achieve low regret.
Game Theoretic Analysis and Design for Network Security
by Kien Chi Nguyen
, 2011
Together with the massive and rapid evolution of computer networks, there has been a surge of research interest and activity surrounding network security recently. A secure network has to provide users with confidentiality, authentication, data integrity and nonrepudiation, and availability and access control, among other features. With the evolution of current attacks and the emergence of new attacks, in addition to traditional countermeasures, networked systems have to adopt more quantitative approaches to guarantee these features. In response to this need, we study in this thesis several quantitative approaches based on decision theory and game theory for network security. We first examine decentralized detection problems with a finite number of sensors making conditionally correlated measurements regarding several hypotheses. Each sensor sends to a fusion center an integer from a finite alphabet, and the fusion center makes a decision on the actual hypothesis based on the messages it receives from the sensors. We show that when the observations are conditionally dependent, the Bayesian probability of error can no longer be expressed as a function of the marginal probabilities. We then characterize this probability of error based on the set of joint probabilities of the sensor messages. We show

