Results 1-10 of 33
Learning by Trial and Error
, 2008
Cited by 32 (4 self)
A person learns by trial and error if he occasionally tries out new strategies, rejecting choices that are erroneous in the sense that they do not lead to higher payoffs. In a game, however, strategies can become erroneous due to a change of behavior by someone else. We introduce a learning rule in which behavior is conditional on whether a player experiences an error of the first or second type. This rule, called interactive trial and error learning, implements Nash equilibrium behavior in any game with generic payoffs and at least one pure Nash equilibrium. JEL Classification: C72, D83
Learning Efficient Nash Equilibria in Distributed Systems
, 2010
Cited by 16 (1 self)
Abstract. An individual’s learning rule is completely uncoupled if it does not depend on the actions or payoffs of anyone else. We propose a variant of log-linear learning that is completely uncoupled and that selects an efficient pure Nash equilibrium in all generic n-person games that possess at least one pure Nash equilibrium. In games that do not have such an equilibrium, there is a simple formula that expresses the long-run probability of the various disequilibrium states in terms of two factors: i) the sum of payoffs over all agents, and ii) the maximum payoff gain that results from a unilateral deviation by some agent. This welfare/stability tradeoff criterion provides a novel framework for analyzing the selection of disequilibrium as well as equilibrium states in n-person games. JEL: C72, C73
1. Learning equilibrium in complex interactive systems
Game theory has traditionally focussed on situations that involve a small number of players. In these environments it makes sense to assume that players know the structure of the game and can predict the strategic behavior of their opponents. But there are many situations involving huge numbers of players where these assumptions are not particularly persuasive.
Designing Games for Distributed Optimization
Cited by 14 (2 self)
Abstract. The central goal in multiagent systems is to design local control laws for the individual agents to ensure that the emergent global behavior is desirable with respect to a given system-level objective. Ideally, a system designer seeks to satisfy this goal while conditioning each agent’s control law on the least amount of information possible. Unfortunately, there are no existing methodologies for addressing this design challenge. The goal of this paper is to address this challenge using the field of game theory. Utilizing game theory for the design and control of multiagent systems requires two steps: (i) defining a local objective function for each decision maker and (ii) specifying a distributed learning algorithm to reach a desirable operating point. One of the core advantages of this game-theoretic approach is that this two-step process can be decoupled by utilizing specific classes of games. For example, if the designed objective functions result in a potential game, then the system designer can utilize distributed learning algorithms for potential games to complete step (ii) of the design process. Unfortunately, designing agent objective functions to meet objectives such as locality of information and efficiency of resulting equilibria within the framework of potential games is fundamentally challenging and in many cases impossible. In this paper we develop a systematic methodology for meeting these objectives using a broader framework of games termed state based potential games. State based potential games are an extension of potential games in which an additional state variable is introduced into the game environment, permitting more flexibility in our design space. Furthermore, state based potential games possess an underlying structure that can be exploited by distributed learning algorithms in a similar fashion to potential games, providing a new baseline for our decomposition.
Convergence to equilibrium of logit dynamics for strategic games. CoRR abs/1212.1884. Preliminary version appeared in SPAA
, 2011
Cited by 10 (7 self)
We present the first general bounds on the mixing time of logit dynamics for wide classes of strategic games. The logit dynamics describes the behaviour of a complex system whose individual components act selfishly and keep responding according to some partial (“noisy”) knowledge of the system. In particular, we prove nearly tight bounds for potential games and games with dominant strategies. Our results show that, for potential games, the mixing time is upper and lower bounded by an exponential in the inverse of the noise and in the maximum potential difference. Instead, for games with dominant strategies, the mixing time cannot grow arbitrarily with the inverse of the noise. Finally, we refine our analysis for a subclass of potential games called graphical coordination games and we give evidence that the mixing time strongly depends on the structure of the underlying graph. Games in this class have been previously studied in Physics and, more recently, in Computer Science in the context of diffusion of new technologies.
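As a rough illustration of the dynamics this abstract refers to, the sketch below uses the standard textbook form of logit dynamics: at each step one player is chosen uniformly at random and resamples an action with probability proportional to exp(beta * payoff), where beta is the inverse of the noise. The function names and the two-player coordination game are hypothetical and are not taken from the paper.

```python
import math
import random


def logit_probs(payoffs, beta):
    """Logit (softmax) choice probabilities for one player's candidate payoffs."""
    m = max(payoffs)  # subtract the max for numerical stability
    weights = [math.exp(beta * (u - m)) for u in payoffs]
    total = sum(weights)
    return [w / total for w in weights]


def logit_step(profile, payoff, n_actions, beta, rng):
    """One step of logit dynamics: a uniformly random player resamples an action."""
    i = rng.randrange(len(profile))
    candidate_payoffs = []
    for a in range(n_actions):
        trial = list(profile)
        trial[i] = a
        candidate_payoffs.append(payoff(i, tuple(trial)))
    probs = logit_probs(candidate_payoffs, beta)
    new_action = rng.choices(range(n_actions), weights=probs)[0]
    updated = list(profile)
    updated[i] = new_action
    return tuple(updated)


def coordination_payoff(i, profile):
    """Hypothetical 2-player coordination game: both players earn 1 if they match."""
    return 1.0 if profile[0] == profile[1] else 0.0
```

With beta = 0 every action is equally likely (pure noise), while a large beta makes the update close to a best response, which is the regime in which the abstract says mixing in potential games can become slow.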
Aspiration learning in coordination games
 in IEEE Conference on Decision and Control
, 2010
Cited by 7 (2 self)
Abstract. We consider the problem of distributed convergence to efficient outcomes in coordination games through payoff-based learning dynamics, namely aspiration learning. The proposed learning scheme assumes that players reinforce well-performing actions by playing them successively; otherwise, they randomize among alternative actions. Our first contribution is the characterization of the asymptotic behavior of the induced Markov chain of the iterated process by an equivalent finite-state Markov chain, which simplifies previously introduced analysis on aspiration learning. We then characterize explicitly the behavior of the proposed aspiration learning in a generalized version of so-called coordination games, an example of which is network formation games. In particular, we show that in coordination games the expected percentage of time that the efficient action profile is played can become arbitrarily large.
Distributed Selfish Load Balancing on Networks
Cited by 6 (2 self)
We study distributed load balancing in networks with selfish agents. In the simplest model considered here, there are n identical machines represented by vertices in a network and m ≫ n selfish agents that unilaterally decide to move from one vertex to another if this improves their experienced load. We present several protocols for concurrent migration that satisfy desirable properties such as being based only on local information and computation and the absence of global coordination or cooperation of agents. Our main contribution is to show rapid convergence of the resulting migration process to states that satisfy different stability or balance criteria. In particular, the convergence time to a Nash equilibrium is only logarithmic in m and polynomial in n, where the polynomial depends on the graph structure. Using a slight modification with neutral moves, a perfectly balanced state can be reached after additional time polynomial in n. In addition, we show reduced convergence times to approximate Nash equilibria. Finally, we extend our results to networks of machines with different speeds or to agents that have different weights and show similar results for convergence to approximate and exact Nash equilibria.
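To make the equilibrium notion in this abstract concrete, here is a minimal sketch under stated assumptions: agents have unit weight, machines are identical, an agent's experienced load is the number of agents on its vertex, and agents may move only along edges. Under those assumptions a move from u to v helps exactly when load[u] > load[v] + 1. The sequential migration loop is a simplification for illustration only; it is not one of the paper's concurrent protocols, and all names are hypothetical.

```python
def is_nash(loads, edges):
    """True if no unit-weight agent can lower its load by moving along an edge."""
    for u, v in edges:
        if loads[u] > loads[v] + 1 or loads[v] > loads[u] + 1:
            return False
    return True


def migrate_sequentially(loads, edges):
    """Apply single improving moves one at a time until none remains.

    Each move strictly lowers the sum of squared loads, so the loop terminates.
    Returns the final loads and the number of moves made.
    """
    loads = list(loads)
    moves = 0
    while True:
        move = None
        for a, b in edges:
            for u, v in ((a, b), (b, a)):
                if loads[u] > loads[v] + 1:
                    move = (u, v)
                    break
            if move is not None:
                break
        if move is None:
            return loads, moves
        u, v = move
        loads[u] -= 1
        loads[v] += 1
        moves += 1
```

For example, `migrate_sequentially([6, 0, 0], [(0, 1), (1, 2)])` on a three-vertex path ends in a state where adjacent loads differ by at most one, which need not be perfectly balanced on larger graphs.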
On the Structure of Weakly Acyclic Games
Cited by 6 (1 self)
Abstract. The class of weakly acyclic games, which includes potential games and dominance-solvable games, captures many practical application domains. Informally, a weakly acyclic game is one where natural distributed dynamics, such as better-response dynamics, cannot enter inescapable oscillations. We establish a novel link between such games and the existence of pure Nash equilibria in subgames. Specifically, we show that the existence of a unique pure Nash equilibrium in every subgame implies the weak acyclicity of a game. In contrast, the possible existence of multiple pure Nash equilibria in every subgame is insufficient for weak acyclicity.
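The better-response dynamics mentioned in this abstract can be sketched as follows. This is a minimal illustration, not the paper's construction: it follows one particular improvement path (the first improving unilateral deviation found), whereas weak acyclicity only guarantees that some improvement path from each profile reaches a pure Nash equilibrium, hence the `max_steps` guard. The function names and the example game are hypothetical.

```python
def better_response_path(payoff, n_actions, start, max_steps=1000):
    """Follow unilateral improving moves until no player can improve.

    payoff(i, profile) returns player i's payoff; n_actions[i] is the number
    of actions of player i. Returns the list of visited profiles, ending at a
    pure Nash equilibrium, or None if max_steps is exhausted (possible cycle).
    """
    profile = tuple(start)
    path = [profile]
    for _ in range(max_steps):
        move = None
        for i in range(len(profile)):
            for a in range(n_actions[i]):
                trial = profile[:i] + (a,) + profile[i + 1:]
                if payoff(i, trial) > payoff(i, profile):
                    move = trial
                    break
            if move is not None:
                break
        if move is None:
            return path  # no improving deviation: pure Nash equilibrium
        profile = move
        path.append(profile)
    return None


def coordination(i, profile):
    """Hypothetical 2x2 coordination game: (0, 0) pays 2, (1, 1) pays 1, else 0."""
    if profile == (0, 0):
        return 2
    if profile == (1, 1):
        return 1
    return 0
```

Starting from a miscoordinated profile such as `(0, 1)`, the path ends at one of the two pure equilibria; which one depends on the order in which deviations are tried.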
Multiagent learning in large anonymous games
 Journal of Artificial Intelligence Research
, 2011
Cited by 5 (0 self)
Abstract. In large systems, it is important for agents to learn to act effectively, but sophisticated multiagent learning algorithms generally do not scale. An alternative approach is to find restricted classes of games where simple, efficient algorithms converge. It is shown that stage learning efficiently converges to Nash equilibria in large anonymous games if best-reply dynamics converge. Two features are identified that improve convergence. First, rather than making learning more difficult, more agents are actually beneficial in many settings. Second, providing agents with statistical information about the behavior of others can significantly reduce the number of observations needed.
Weakly-Acyclic (Internet) Routing Games
Cited by 5 (0 self)
Abstract. Weakly-acyclic games – a superclass of potential games – capture distributed environments where simple, globally-asynchronous interactions between strategic agents are guaranteed to converge to an equilibrium. We explore the class of routing games in [4, 12], which models important aspects of routing on the Internet. We show that, in interesting contexts, such routing games are weakly acyclic and, moreover, that pure Nash equilibria in such games can be found in a computationally efficient manner.
Sampled Fictitious Play for Approximate Dynamic Programming
, 2011
Cited by 4 (3 self)
Sampled Fictitious Play (SFP) is a recently proposed iterative learning mechanism for computing Nash equilibria of noncooperative games. For games of identical interests, every limit point of the sequence of mixed strategies induced by the empirical frequencies of best response actions that players in SFP play is a Nash equilibrium. Because discrete optimization problems can be viewed as games of identical interests wherein Nash equilibria define a type of local optimum, SFP has recently been employed as a heuristic optimization algorithm with promising empirical performance. However, there have been no guarantees of convergence to a globally optimal Nash equilibrium established for any of the problem classes considered to date. In this paper, we introduce a variant of SFP and show that it converges almost surely to optimal policies in model-free, finite-horizon stochastic dynamic programs. The key idea is to view the dynamic programming states as players, whose common interest is to maximize the total multiperiod expected reward starting in a fixed initial state. We also offer empirical results suggesting that our SFP variant is effective in practice for small to moderate-sized model-free problems.