Results 1–10 of 16
Revisiting Log-Linear Learning: Asynchrony, Completeness and Payoff-Based Implementation
, 2008
Abstract

Cited by 42 (14 self)
Log-linear learning is a learning algorithm with equilibrium selection properties. Log-linear learning provides guarantees on the percentage of time that the joint action profile will be at a potential maximizer in potential games. The traditional analysis of log-linear learning has centered around explicitly computing the stationary distribution. This analysis relied on a highly structured setting: i) players' utility functions constitute a potential game, ii) players update their strategies one at a time, which we refer to as asynchrony, iii) at any stage, a player can select any action in the action set, which we refer to as completeness, and iv) each player is endowed with the ability to assess the utility he would have received for any alternative action provided that the actions of all other players remain fixed. Since the appeal of log-linear learning is not solely the explicit form of the stationary distribution, we seek to address to what degree one can relax the structural assumptions while maintaining that only potential function maximizers are the stochastically stable action profiles. In this paper, we introduce slight variants of log-linear learning to include both synchronous updates and incomplete action sets. In both settings, we prove that only potential function maximizers are stochastically stable. Furthermore, we introduce a payoff-based version of log-linear learning, in which players are only aware of the utility they received and the action that they played. Note that log-linear learning in its original form is not a payoff-based learning algorithm. In payoff-based log-linear learning, we also prove that only potential maximizers are stochastically stable. The key enabler for these results is to change the focus of the analysis away from deriving the explicit form of the stationary distribution of the learning process towards characterizing the stochastically stable states. The resulting analysis uses the theory of resistance trees for regular perturbed Markov decision processes, thereby allowing a relaxation of the aforementioned structural assumptions.
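The structured setting described above can be previewed with a minimal sketch of log-linear learning under asynchrony and completeness. The game, temperature, and horizon below are illustrative assumptions, not taken from the paper:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical identical-interest (hence potential) game: every player's
# utility equals the potential, and (1, 1) is the unique potential maximizer.
potential = np.array([[1.0, 0.0],
                      [0.0, 2.0]])

def log_linear_step(a, tau):
    i = rng.integers(2)                # asynchrony: one random reviser per step
    payoffs = []
    for x in range(2):                 # completeness: evaluate every action
        trial = list(a)
        trial[i] = x
        payoffs.append(potential[trial[0], trial[1]])
    w = np.exp(np.array(payoffs) / tau)
    a[i] = rng.choice(2, p=w / w.sum())  # Boltzmann choice over own actions
    return a

a, tau, T, hits = [0, 0], 0.25, 50000, 0
for _ in range(T):
    a = log_linear_step(a, tau)
    hits += (a == [1, 1])

print(hits / T)   # most of the time is spent at the potential maximizer
```

At low temperature the empirical frequency tracks the Gibbs weighting exp(φ/τ), which is the stationary-distribution intuition the abstract refers to.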
Payoff-based dynamics for multi-player weakly acyclic games
 SIAM J. CONTROL OPT
, 2009
Abstract

Cited by 29 (10 self)
We consider repeated multi-player games in which players repeatedly and simultaneously choose strategies from a finite set of available strategies according to some strategy adjustment process. We focus on the specific class of weakly acyclic games, which is particularly relevant for multi-agent cooperative control problems. A strategy adjustment process determines how players select their strategies at any stage as a function of the information gathered over previous stages. Of particular interest are “payoff-based” processes in which, at any stage, players know only their own actions and (noise corrupted) payoffs from previous stages. In particular, players do not know the actions taken by other players and do not know the structural form of payoff functions. We introduce three different payoff-based processes for increasingly general scenarios and prove that, after a sufficiently large number of stages, player actions constitute a Nash equilibrium at any stage with arbitrarily high probability. We also show how to modify player utility functions through tolls and incentives in so-called congestion games, a special class of weakly acyclic games, to guarantee that a centralized objective can be realized as a Nash equilibrium. We illustrate the methods with a simulation of distributed routing over a network.
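A heavily simplified illustration of a payoff-based process: each player keeps a baseline action, occasionally experiments, and adopts any action that strictly beat its baseline payoff. The coordination game, exploration rate, and adoption rule are illustrative assumptions, not the paper's three processes:

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical symmetric coordination game (weakly acyclic):
# both players get 1 if their actions match, else 0.
def payoff(a):
    u = 1.0 if a[0] == a[1] else 0.0
    return (u, u)

n_actions, eps, T = 3, 0.1, 5000
baseline = [0, 1]          # baseline actions, initially mismatched
base_pay = [0.0, 0.0]      # payoff each player attributes to its baseline

for t in range(T):
    # Each player chooses only its own action: explore w.p. eps, else baseline.
    a = [rng.integers(n_actions) if rng.random() < eps else baseline[i]
         for i in range(2)]
    u = payoff(a)          # players observe only their own realized payoff
    for i in range(2):
        if u[i] > base_pay[i]:            # keep what worked strictly better
            baseline[i], base_pay[i] = a[i], u[i]

print(baseline[0] == baseline[1])  # baselines coordinate on a Nash equilibrium
```

Once a stage happens to match, both players lock in that action, so the baselines settle at a pure Nash equilibrium with high probability, echoing the guarantee stated in the abstract.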
Payoff-Based Dynamics for Multi-Player Weakly Acyclic Games
 SIAM JOURNAL ON CONTROL AND OPTIMIZATION, SPECIAL ISSUE ON CONTROL AND OPTIMIZATION IN COOPERATIVE NETWORKS
, 2007
Abstract

Cited by 29 (16 self)
We consider repeated multi-player games in which players repeatedly and simultaneously choose strategies from a finite set of available strategies according to some strategy adjustment process. We focus on the specific class of weakly acyclic games, which is particularly relevant for multi-agent cooperative control problems. A strategy adjustment process determines how players select their strategies at any stage as a function of the information gathered over previous stages. Of particular interest are “payoff-based” processes, in which at any stage, players only know their own actions and (noise corrupted) payoffs from previous stages. In particular, players do not know the actions taken by other players and do not know the structural form of payoff functions. We introduce three different payoff-based processes for increasingly general scenarios and prove that after a sufficiently large number of stages, player actions constitute a Nash equilibrium at any stage with arbitrarily high probability. We also show how to modify player utility functions through tolls and incentives in so-called congestion games, a special class of weakly acyclic games, to guarantee that a centralized objective can be realized as a Nash equilibrium. We illustrate the methods with a simulation of distributed routing over a network.
Learning Efficient Nash Equilibria in Distributed Systems
, 2010
Abstract

Cited by 15 (1 self)
Abstract. An individual's learning rule is completely uncoupled if it does not depend on the actions or payoffs of anyone else. We propose a variant of log-linear learning that is completely uncoupled and that selects an efficient pure Nash equilibrium in all generic n-person games that possess at least one pure Nash equilibrium. In games that do not have such an equilibrium, there is a simple formula that expresses the long-run probability of the various disequilibrium states in terms of two factors: i) the sum of payoffs over all agents, and ii) the maximum payoff gain that results from a unilateral deviation by some agent. This welfare/stability tradeoff criterion provides a novel framework for analyzing the selection of disequilibrium as well as equilibrium states in n-person games. JEL: C72, C73. 1. Learning equilibrium in complex interactive systems: Game theory has traditionally focused on situations that involve a small number of players. In these environments it makes sense to assume that players know the structure of the game and can predict the strategic behavior of their opponents. But there are many situations involving huge numbers of players where these assumptions are not particularly persuasive.
Network Formation: Bilateral Contracting and Myopic Dynamics
, 2008
Abstract

Cited by 10 (2 self)
We consider a network formation game where a finite number of nodes wish to send traffic to each other. Nodes contract bilaterally with each other to form bidirectional communication links; once the network is formed, traffic is routed along shortest paths (if possible). A node incurs cost from four sources: (1) routing traffic; (2) maintaining links to other nodes; (3) disconnection from destinations the node wishes to reach; and (4) payments made to other nodes. We assume that a network is stable if no single node wishes to unilaterally deviate, and no pair of nodes can profitably deviate together (a variation on the notion of pairwise stability). We study such a game under a form of myopic best response dynamics. In choosing their best strategy, nodes optimize their single-period payoff only. We characterize a simple set of assumptions under which these dynamics will converge to a pairwise stable network topology; we also characterize an important special case, where the dynamics converge to a star centered at a node with minimum cost for routing traffic. In this sense, our dynamics naturally select an efficient equilibrium. Further, we show that these assumptions are satisfied by a contractual model motivated by bilateral Rubinstein bargaining with infinitely patient players.
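A stripped-down myopic pairwise dynamic gives the flavor of this setup. The cost model, parameters, and update schedule below are illustrative assumptions (no traffic routing or payments): node i pays LINK_COST per incident link plus DISC_COST per node it cannot reach.

```python
import itertools

N, LINK_COST, DISC_COST = 6, 1.0, 10.0

def reachable(adj, i):
    # Set of nodes reachable from i (depth-first search).
    seen, stack = {i}, [i]
    while stack:
        u = stack.pop()
        for v in range(N):
            if adj[u][v] and v not in seen:
                seen.add(v)
                stack.append(v)
    return seen

def cost(adj, i):
    return LINK_COST * sum(adj[i]) + DISC_COST * (N - len(reachable(adj, i)))

adj = [[0] * N for _ in range(N)]        # start from the empty network
changed = True
while changed:                           # sweep pairs until no deviation helps
    changed = False
    for i, j in itertools.combinations(range(N), 2):
        ci, cj = cost(adj, i), cost(adj, j)
        adj[i][j] = adj[j][i] = 1 - adj[i][j]            # tentatively toggle
        if adj[i][j]:   # adding a link needs both to agree (pairwise move)
            ok = (cost(adj, i) <= ci and cost(adj, j) <= cj
                  and (cost(adj, i) < ci or cost(adj, j) < cj))
        else:           # cutting a link is unilateral: either endpoint may cut
            ok = cost(adj, i) < ci or cost(adj, j) < cj
        if ok:
            changed = True
        else:
            adj[i][j] = adj[j][i] = 1 - adj[i][j]        # revert the toggle
print(len(reachable(adj, 0)) == N)       # the stable network is connected
```

With the disconnection penalty dominating the link cost, these sweeps settle on a connected, pairwise-stable topology (here a star), loosely mirroring the star-selection result in the abstract.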
Game Theory and Distributed Control
, 2012
Abstract

Cited by 3 (1 self)
Game theory has traditionally been employed as a modeling tool for describing and influencing behavior in societal systems. Recently, game theory has emerged as a valuable tool for controlling or prescribing behavior in distributed engineered systems. The rationale for this new perspective stems from the parallels between the underlying decision-making architectures in both societal systems and distributed engineered systems. In particular, both settings involve an interconnection of decision-making elements whose collective behavior depends on a compilation of local decisions that are based on partial information about each other and the state of the world. Accordingly, there is extensive work in game theory that is relevant to the engineering agenda. Similarities notwithstanding, there remain important differences between the constraints and objectives in societal and engineered systems that require looking at game-theoretic methods from a new perspective. This chapter provides an overview of selected recent developments of game-theoretic methods in this role as a framework for distributed control in engineered systems.
Local Two-Stage Myopic Dynamics for Network Formation Games
Abstract

Cited by 2 (0 self)
Abstract. Network formation games capture two conflicting objectives of self-interested nodes in a network. On one hand, such a node wishes to be able to reach all other nodes in the network; on the other hand, it wishes to minimize its cost of participation. We focus on myopic dynamics in a class of such games inspired by transportation and communication models. A key property of the dynamics we study is that they are local: nodes can only deviate to form links with others in a restricted neighborhood. Despite this locality, we find that our dynamics converge to efficient or nearly efficient outcomes in a range of settings of interest.
Emergent Collective Behavior in Multi-Agent Systems: An Evolutionary Perspective
Abstract

Cited by 2 (0 self)
The study of collective behavior involves the analysis of interactions among a set of agents that yield collective outcomes at the level of the group. The behavior is said to be emergent when it cannot be understood simply as the sum of its constituent parts. Further, group-level outcomes can in turn influence individual interactions. The complexity of this interplay makes the study of emergence challenging and exciting. This dissertation is focused on the study of emergent collective behavior from the perspective of evolution. Evolution is a simple yet powerful algorithm, which when acting on interacting entities in a dynamic environment, yields an array of fascinating behavior as manifest in the natural world. Natural collectives display a wide variety of cooperative behavior and have evolved to efficiently manage the inherent tradeoff between robust behavior and adaptability to dynamic environments. These properties have motivated the design of bio-inspired algorithms for sensing and decision-making in robotic collectives. In this work, we study the evolutionary mechanisms for cooperation and tradeoff management in biological collectives, with a focus on four related topics: replicator-mutator dynamics, collective migration, collective pursuit
Robustness of Stochastic Stability in Game Theoretic Learning
Abstract

Cited by 1 (0 self)
Abstract—The notion of stochastic stability is used in game theoretic learning to characterize which joint actions of players exhibit high probabilities of occurrence in the long run. This paper examines the impact of two types of errors on stochastic stability: i) small unstructured uncertainty in the game parameters and ii) slow time variations of the game parameters. In the first case, we derive a continuity result that bounds the effects of small uncertainties. In the second case, we show that game play tracks drifting stochastically stable states under sufficiently slow time variations. The analysis is in terms of Markov chains and hence is applicable to a variety of game theoretic learning rules. Nonetheless, the approach is illustrated on the widely studied rule of log-linear learning. Finally, the results are applied in both simulation and laboratory experiments to distributed area coverage with mobile robots.
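The first type of result can be previewed numerically: for an ergodic Markov chain, the stationary distribution is a continuous function of the transition probabilities, so small uncertainty in a parameter shifts the long-run distribution only slightly. The two-state chain and parameter values below are illustrative assumptions, not from the paper:

```python
import numpy as np

def stationary(P):
    # Left Perron eigenvector of P (eigenvalue 1), normalized to sum to 1.
    vals, vecs = np.linalg.eig(P.T)
    v = np.real(vecs[:, np.argmax(np.real(vals))])
    return v / v.sum()

def chain(eps):
    # eps stands in for an uncertain game parameter driving one transition.
    return np.array([[1.0 - eps, eps],
                     [0.3, 0.7]])

pi = stationary(chain(0.10))         # exact: (0.75, 0.25)
pi_tilde = stationary(chain(0.11))   # slightly perturbed parameter
print(np.abs(pi - pi_tilde).max())   # small shift in the long-run behavior
```

A 10% perturbation of eps moves the stationary distribution by only a couple of percentage points here, which is the kind of continuity the abstract's first result formalizes for game theoretic learning rules.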
Cooperative learning in multi-agent systems from intermittent measurements
Abstract

Cited by 1 (0 self)
Abstract — Motivated by the problem of decentralized direction-tracking, we consider the general problem of cooperative learning in multi-agent systems with time-varying connectivity and intermittent measurements. We propose a distributed learning protocol capable of learning an unknown vector µ from noisy measurements made independently by autonomous nodes. Our protocol is completely distributed and able to cope with the time-varying, unpredictable, and noisy nature of inter-agent communication, and intermittent noisy measurements of µ. Our main result bounds the learning speed of our protocol in terms of the size and combinatorial features of the (time-varying) network connecting the nodes.
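A generic consensus-plus-measurements sketch conveys the flavor of such a protocol: nodes gossip to average their estimates and fold in noisy observations of µ when they arrive. The gossip schedule, measurement probability, noise level, and step sizes are illustrative assumptions; the paper's actual protocol and learning-speed bounds are not reproduced here.

```python
import numpy as np

rng = np.random.default_rng(2)
n, dim = 5, 2
mu = np.array([1.0, -2.0])           # unknown vector to be learned
x = np.zeros((n, dim))               # each node's running estimate of mu
m = np.zeros(n)                      # per-node measurement counts

for t in range(3000):
    # Time-varying connectivity: one random pairwise gossip exchange per step.
    i, j = rng.choice(n, size=2, replace=False)
    x[i] = x[j] = (x[i] + x[j]) / 2
    # Intermittent measurements: each node observes mu + noise w.p. 0.3 and
    # folds it in with a decaying (stochastic-approximation) step size.
    for k in range(n):
        if rng.random() < 0.3:
            m[k] += 1
            y = mu + rng.normal(scale=0.5, size=dim)
            x[k] += (y - x[k]) / m[k]

print(np.abs(x - mu).max())          # every node's estimate ends up near mu
```

Even though each node sees only a random subset of measurements and one neighbor at a time, the mixing spreads information so that all estimates converge toward µ; the paper's contribution is quantifying how fast, in terms of the network's combinatorial features.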