Results 11–20 of 33
Efficiency and equilibrium in trial and error learning
, 2010
"... Abstract. In trial and error learning, agents experiment with new strategies and adopt them with a probability that depends on their realized payoffs. Such rules are completely uncoupled, that is, each agent's behaviour depends only on his own realized payoffs and not on the payoffs or actions ..."
Abstract

Cited by 2 (0 self)
In trial and error learning, agents experiment with new strategies and adopt them with a probability that depends on their realized payoffs. Such rules are completely uncoupled, that is, each agent's behaviour depends only on his own realized payoffs and not on the payoffs or actions of anyone else. We show that by modifying a trial and error learning rule proposed by Young (2009) we obtain a completely uncoupled learning process that selects a Pareto optimal equilibrium whenever a pure equilibrium exists. When a pure equilibrium does not exist, there is a simple formula that relates the long-run likelihood of each disequilibrium state to the total payoff over all agents and the maximum payoff gain that would result from a unilateral deviation by some agent. This welfare/stability tradeoff criterion provides a novel framework for analyzing the selection of disequilibrium as well as equilibrium states in finite n-person games. Acknowledgements. We thank Gabriel Kreindler for suggesting a number of improvements
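The adoption rule described in this abstract can be illustrated with a minimal sketch. This is not Young's (2009) rule or the authors' modification of it; the logistic acceptance function and all parameter names are hypothetical, chosen only to show what a completely uncoupled update looks like:

```python
import math
import random

def trial_and_error_step(current, payoff_fn, strategies, epsilon=0.1, rng=random):
    """One period of a completely uncoupled trial-and-error rule (hypothetical
    sketch). The agent reads only its own realized payoffs, never the payoffs
    or actions of anyone else.
    """
    if rng.random() >= epsilon:
        return current                       # no experiment this period
    trial = rng.choice([s for s in strategies if s != current])
    gain = payoff_fn(trial) - payoff_fn(current)
    # illustrative logistic acceptance: adoption probability rises with the
    # realized payoff gain (the paper does not specify this functional form)
    adopt_prob = 1.0 / (1.0 + math.exp(-gain))
    return trial if rng.random() < adopt_prob else current
```

Because the update reads nothing but the agent's own payoffs, running one copy per agent yields a completely uncoupled process in the sense of the abstract.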
Game Couplings: Learning Dynamics and Applications
"... Modern engineering systems (such as the Internet) consist of multiple coupled subsystems. Such subsystems are designed with local (possibly conflicting) goals, with little or no knowledge of the implementation details of other subsystems. Despite the ubiquitous nature of such systems very little is ..."
Abstract

Cited by 2 (0 self)
Modern engineering systems (such as the Internet) consist of multiple coupled subsystems. Such subsystems are designed with local (possibly conflicting) goals, with little or no knowledge of the implementation details of other subsystems. Despite the ubiquitous nature of such systems, very little is formally known about their properties and global dynamics. We investigate such distributed systems by introducing a novel game-theoretic construct that we call game coupling. Game coupling intuitively allows us to stitch together the payoff structures of subgames. In order to study efficiency issues, we extend the price of anarchy approach (a major focus of game-theoretic multiagent systems [22]) to this setting, where we now care about the performance of each individual subsystem as well as the global performance. Such concerns give rise to a new notion of equilibrium, as well as a new learning paradigm. We prove matching welfare guarantees for both, for individual subsystems as well as for the global system, using a generalization of the (λ, µ)-smoothness framework [19]. In the second part of the paper, we work on understanding conditions that allow for well-structured couplings. More generally, we examine when game couplings preserve or enhance desirable properties of the original games, such as convergence of best response dynamics and low price of anarchy.
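The (λ, µ)-smoothness framework the abstract builds on can be made concrete with a brute-force check on a tiny finite cost game. The function below follows the standard smoothness inequality for cost-minimization games, not the paper's generalization to couplings; the example game and all names are illustrative:

```python
from itertools import product

def smoothness_holds(cost, n, choices, lam, mu):
    """Brute-force check of the (lambda, mu)-smoothness inequality
        sum_i cost_i(s*_i, s_-i) <= lam * C(s*) + mu * C(s)
    over all pure profile pairs (s, s*), where C is the total cost.
    Smoothness implies a price-of-anarchy bound of lam / (1 - mu) for mu < 1.
    """
    def total(p):
        return sum(cost(i, p) for i in range(n))

    profiles = list(product(choices, repeat=n))
    for s in profiles:
        for s_star in profiles:
            # unilateral deviations of each player i from s to s*_i
            lhs = sum(cost(i, s[:i] + (s_star[i],) + s[i + 1:])
                      for i in range(n))
            if lhs > lam * total(s_star) + mu * total(s) + 1e-9:
                return False
    return True

# Two players each pick one of two resources; a player's cost is the load on
# its own resource. Affine congestion games like this are (5/3, 1/3)-smooth.
def load_cost(i, profile):
    return sum(1 for r in profile if r == profile[i])
```

For the two-resource game above, `smoothness_holds(load_cost, 2, (0, 1), 5/3, 1/3)` succeeds, giving the familiar price-of-anarchy bound of 5/2.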
Hedonic coalition formation for optimal deployment
 Automatica
, 2013
"... Abstract This paper presents a distributed algorithmic solution, termed Coalition formation and deployment algorithm, to achieve network configurations where agents cluster into coincident groups that are distributed optimally over the environment. The motivation for this problem comes from spatial ..."
Abstract

Cited by 2 (0 self)
This paper presents a distributed algorithmic solution, termed Coalition formation and deployment algorithm, to achieve network configurations where agents cluster into coincident groups that are distributed optimally over the environment. The motivation for this problem comes from spatial estimation tasks executed with unreliable sensors. We propose a probabilistic strategy that combines a repeated game governing the formation of coalitions with a spatial motion component governing their location. For a class of probabilistic coalition switching laws, we establish the convergence of the agents to coincident groups of a desired size in finite time and the asymptotic convergence of the overall network to the optimal deployment, both with probability 1. We also investigate the algorithm's time and communication complexity. Specifically, we upper bound the expected completion time of executions that use the proportional-to-number-of-unmatched-agents coalition switching law under arbitrary and complete communication topologies. We also upper bound the number of messages required per time step to execute our strategy. The proposed algorithm is robust to agent addition and subtraction. From a coalitional game perspective, the algorithm is novel in that the players' information is limited to neighboring clusters. From a motion coordination perspective, the algorithm is novel because it brings together the basic tasks of rendezvous (individual agents into clusters) and deployment (clusters in the environment). Simulations illustrate the correctness, robustness, and complexity results.
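The flavor of a probabilistic coalition switching law can be sketched in a much-simplified form. The rule below is only a guess at the shape of a proportional-to-number-of-unmatched-agents law; the paper's actual law, state, and communication model are richer, and every name here is hypothetical:

```python
import random

def coalition_switch(cluster_sizes, my_cluster, desired_size, rng=random):
    """One coalition-switch decision (hypothetical simplification). An agent
    in an over-full cluster leaves with probability proportional to the
    number of agents exceeding the desired coalition size, and joins a
    uniformly random under-full cluster if one exists.
    """
    size = cluster_sizes[my_cluster]
    excess = max(0, size - desired_size)
    p_leave = excess / size if size else 0.0
    if rng.random() < p_leave:
        targets = [c for c, s in enumerate(cluster_sizes) if s < desired_size]
        if targets:
            return rng.choice(targets)
    return my_cluster
```

Note that agents in clusters at or below the desired size never move, which is what drives convergence to groups of the desired size in laws of this type.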
Harvard University
"... In large systems, it is important for agents to learn to act effectively, but sophisticated multiagent learning algorithms generally do not scale. An alternative approach is to find restricted classes of games where simple, efficient algorithms converge. It is shown that stage learning efficiently ..."
Abstract

Cited by 1 (0 self)
In large systems, it is important for agents to learn to act effectively, but sophisticated multiagent learning algorithms generally do not scale. An alternative approach is to find restricted classes of games where simple, efficient algorithms converge. It is shown that stage learning efficiently converges to Nash equilibria in large anonymous games if best-reply dynamics converge. Two features are identified that improve convergence. First, rather than making learning more difficult, more agents are actually beneficial in many settings. Second, providing agents with statistical information about the behavior of others can significantly reduce the number of observations needed.
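The anonymity assumption and the role of best-reply dynamics can be made concrete with a small sketch. This is not the paper's stage-learning algorithm itself; the function names and the example game are illustrative:

```python
from collections import Counter

def best_reply_dynamics(actions, payoff, choices, max_rounds=100):
    """Synchronous best-reply dynamics in an anonymous game: payoff(a, counts)
    depends only on a player's own action and the counts of the other
    players' actions. Returns a profile no player wants to deviate from (a
    pure Nash equilibrium) if the dynamics converge within max_rounds.
    """
    for _ in range(max_rounds):
        counts = Counter(actions)
        new = []
        for a in actions:
            others = counts.copy()
            others[a] -= 1               # exclude the player's own action
            new.append(max(choices, key=lambda c: payoff(c, others)))
        if new == actions:
            break                        # fixed point: no profitable deviation
        actions = new
    return actions
```

In a simple conformity game where each player's payoff is the number of others playing the same action, the dynamics collapse the population onto one action in a couple of rounds, which is the kind of setting where stage learning is shown to converge.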
Learning in a Black Box
, 2013
"... Many interactive environments can be represented as games, but they are so large and complex that individual players are in the dark about what others are doing and how their own payoffs are affected. This paper analyzes learning behavior in such ‘black box’ environments, where players ’ only source ..."
Abstract

Cited by 1 (1 self)
Many interactive environments can be represented as games, but they are so large and complex that individual players are in the dark about what others are doing and how their own payoffs are affected. This paper analyzes learning behavior in such ‘black box’ environments, where players’ only source of information is their own history of actions taken and payoffs received. Specifically, we study repeated public goods games, where players must decide how much to contribute at each stage, but they do not know how much others have contributed or how others’ contributions affect their own payoffs. We identify two key features of the players’ learning dynamics. First, if a player’s realized payoff increases he is less inclined to change his strategy, whereas if his realized payoff decreases he is more inclined to change his strategy. Second, if increasing his own contribution results in higher payoffs he will tend to increase his contribution still further, whereas the reverse holds if an increase in contribution leads to lower payoffs. These two effects are clearly present when players have no information about the game; moreover, they are still present even when players have full information. Convergence to Nash equilibrium
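The two effects identified in this abstract can be caricatured as a directional adjustment heuristic. This is a hypothetical sketch for intuition, not the authors' estimated learning model; in particular, the deterministic reversal below stands in for their probabilistic "more inclined to change" effect:

```python
def black_box_update(contribution, direction, last_payoff, new_payoff,
                     step=1, low=0, high=10):
    """One revision of a player's contribution using only own payoffs.
    Effect 1: a payoff drop makes the player change course (here: reverse).
    Effect 2: if moving in `direction` paid off, keep moving the same way.
    Returns the new contribution and the new direction.
    """
    if new_payoff < last_payoff:
        direction = -direction           # realized payoff fell: reverse
    new_c = min(high, max(low, contribution + direction * step))
    return new_c, direction
```

The point of the sketch is that the rule consults nothing but the player's own payoff history, matching the black-box information structure described above.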
Distributed algorithms for networked multiagent systems: optimization and competition
, 2013
"... This thesis pertains to the development of distributed algorithms in the context of networked multiagent systems. Such engineered systems may be tasked with a variety of goals, ranging from the solution of optimization problems to addressing the solution of variational inequality problems. Two key ..."
Abstract

Cited by 1 (0 self)
This thesis pertains to the development of distributed algorithms in the context of networked multiagent systems. Such engineered systems may be tasked with a variety of goals, ranging from the solution of optimization problems to addressing the solution of variational inequality problems. Two key complicating characteristics of multiagent systems are the following: (i) the lack of availability of system-wide information at any given location; and (ii) the absence of any central coordinator. These intricacies make it infeasible to collect all the information at a single location and preclude the use of centralized algorithms. Consequently, a fundamental question in the design of such systems is the need for developing algorithms that can support their functioning. Accordingly, our goal lies in developing distributed algorithms that can be implemented at a local level while guaranteeing a global system-level requirement. In such techniques, each agent uses locally available information, including that accessible from its immediate neighbors, to update its decisions, rather than availing of the decisions of all agents. This thesis focuses on multiagent systems tasked with the solution of three sets of problems: (i) convex optimization problems; (ii) Cartesian variational inequality problems; and (iii) a subclass of Nash games.
Payoff-based Inhomogeneous Partially Irrational Play for Potential Game Theoretic Cooperative Control: Convergence Analysis
"... Abstract — This paper investigates learning algorithm design in potential game theoretic cooperative control, where it is in general required for agents ’ collective action to converge to the most efficient equilibria while standard game theory aims at just computing a Nash equilibrium. In particula ..."
Abstract

Cited by 1 (1 self)
This paper investigates learning algorithm design in potential game theoretic cooperative control, where it is in general required for agents’ collective action to converge to the most efficient equilibria, while standard game theory aims at just computing a Nash equilibrium. In particular, the equilibria maximizing the potential function should be selected in case the utility functions are already aligned with a global objective function. In order to meet this requirement, this paper develops a learning algorithm called Payoff-based Inhomogeneous Partially Irrational Play (PIPIP). The main feature of PIPIP is to allow agents to make irrational decisions with a specified probability, i.e. agents can choose an action with a low utility from the past actions stored in memory. We then prove convergence in probability of the collective action to the potential function maximizers. Finally, the effectiveness of the present algorithm is demonstrated through simulation on a sensor coverage problem.
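The "partially irrational" choice over remembered actions can be sketched as follows. The two-element memory and the single probability parameter are hypothetical simplifications of PIPIP, shown only to convey the escape mechanism:

```python
import random

def partially_irrational_choice(memory, irrationality, rng=random):
    """Choose the next action from remembered (action, payoff) pairs.
    With probability `irrationality` the agent deliberately replays the
    worst remembered action, which lets the process escape inefficient
    equilibria; otherwise it replays the best remembered action.
    """
    best = max(memory, key=lambda ap: ap[1])
    worst = min(memory, key=lambda ap: ap[1])
    if rng.random() < irrationality:
        return worst[0]                  # irrational move
    return best[0]                       # rational (greedy) move
```

In PIPIP the irrationality probability is inhomogeneous, decreasing over time, so exploration fades and play concentrates on the potential function maximizers; this sketch keeps it fixed for clarity.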
A behavioral study of "noise" in coordination games
ScienceDirect, CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/)
"... Abstract 'Noise' in this study, in the sense of evolutionary game theory, refers to deviations from prevailing behavioral rules. Analyzing data from a laboratory experiment on coordination in networks, we tested 'what kind of noise' is supported by behavioral evidence. This empi ..."
Abstract
'Noise' in this study, in the sense of evolutionary game theory, refers to deviations from prevailing behavioral rules. Analyzing data from a laboratory experiment on coordination in networks, we tested 'what kind of noise' is supported by behavioral evidence. This empirical analysis complements a growing theoretical literature on 'how noise matters' for equilibrium selection. We find that the vast majority of decisions (96%) constitute myopic best responses, but deviations continue to occur with probabilities that are sensitive to their costs, that is, less frequent when implying larger payoff losses relative to the myopic best response. In addition, deviation rates vary with patterns of realized payoffs that are related to trial-and-error behavior. While there is little evidence that deviations are clustered in time or space, there is evidence of individual heterogeneity.
Stochastic Learning Dynamics and Speed of Convergence in Population Games *
Itai Arieli
"... Abstract We study how long it takes for large populations of interacting agents to come close to Nash equilibrium when they adapt their behavior using a stochastic better reply dynamic. Prior work considers this question mainly for 2×2 games and potential games; here we characterize convergence tim ..."
Abstract
We study how long it takes for large populations of interacting agents to come close to Nash equilibrium when they adapt their behavior using a stochastic better reply dynamic. Prior work considers this question mainly for 2×2 games and potential games; here we characterize convergence times for general weakly acyclic games, including coordination games, dominance solvable games, games with strategic complementarities, potential games, and many others with applications in economics, biology, and distributed control. If players' better replies are governed by idiosyncratic shocks, the convergence time can grow exponentially in the population size; moreover, this is true even in games with very simple payoff structures. However, if their responses are sufficiently correlated due to aggregate shocks, the convergence time is greatly accelerated; in fact, it is bounded for all sufficiently large populations. We provide explicit bounds on the speed of convergence as a function of key structural parameters including the number of strategies, the length of the better reply paths, the extent to which players can influence the payoffs of others, and the desired degree of approximation to Nash equilibrium. * The authors thank
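A single revision step of a better reply dynamic of the kind studied here can be sketched as below. The shock structure (idiosyncratic versus aggregate), which drives the paper's main contrast, is omitted; the names and the uniform choice over better replies are illustrative assumptions:

```python
import random

def better_reply_step(profile, i, payoff, choices, rng=random):
    """One better-reply revision by player i: switch to a uniformly random
    strategy that strictly improves on the current payoff, or stay put if
    no strictly better reply exists.
    """
    current = payoff(i, profile)
    better = []
    for c in choices:
        trial = list(profile)
        trial[i] = c
        if payoff(i, trial) > current:   # strict improvement only
            better.append(c)
    if not better:
        return profile                   # player i is already best-placed
    new = list(profile)
    new[i] = rng.choice(better)
    return new
```

In a weakly acyclic game, some sequence of such revisions reaches a pure Nash equilibrium from every profile; the paper's question is how long this takes in large populations under different shock structures.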