
## Regret based dynamics: Convergence in weakly acyclic games (2007)

Venue: | In Proceedings of the 2007 International Conference on Autonomous Agents and Multiagent Systems (AAMAS) |

Citations: | 28 - 11 self |

### Citations

4105 | Game theory
- Fudenberg, Tirole
- 1991
Citation Context ...et of n dimensional probability distributions. - For $x \in \mathbb{R}^n$, $[x]^+ \in \mathbb{R}^n$ denotes the vector whose i-th entry equals max(xi, 0). 2. SETUP 2.1 Finite Strategic-Form Games A finite strategic-form game =-=[5]-=- consists of an n-player set P := {P1, ..., Pn}, a finite action set Yi for each player Pi ∈ P, and a utility function Ui : Y → R for each player Pi ∈ P, where Y := Y1 × ... × Yn. We will henceforth u... |

1126 | The Theory of Learning in Games
- Fudenberg
- 1997
Citation Context ...ered multi-agent systems will lead to weakly acyclic games. 2.4 Repeated Games Here, we deal with the issue of how players can learn to play a pure Nash equilibrium through repeated interactions; see =-=[4, 23, 22, 11, 19, 18]-=-. We assume that each player has access to its own utility function but not to the utility functions of other players. This private utilities assumption is motivated in multi-agent systems by the requ... |

984 | Evolutionary Game Theory
- Weibull
- 1995
Citation Context ...ered multi-agent systems will lead to weakly acyclic games. 2.4 Repeated Games Here, we deal with the issue of how players can learn to play a pure Nash equilibrium through repeated interactions; see =-=[4, 23, 22, 11, 19, 18]-=-. We assume that each player has access to its own utility function but not to the utility functions of other players. This private utilities assumption is motivated in multi-agent systems by the requ... |

856 | Evolutionary Games and Population Dynamics
- Hofbauer, Sigmund
- 1998
Citation Context ...ered multi-agent systems will lead to weakly acyclic games. 2.4 Repeated Games Here, we deal with the issue of how players can learn to play a pure Nash equilibrium through repeated interactions; see =-=[4, 23, 22, 11, 19, 18]-=-. We assume that each player has access to its own utility function but not to the utility functions of other players. This private utilities assumption is motivated in multi-agent systems by the requ... |

751 | Some theoretical aspects of road traffic research
- Wardrop
- 1952
Citation Context ...ion cost on each road converges approximately to the same value, which is consistent with a Nash equilibrium with large number of drivers. This behavior resembles an approximate “Wardrop equilibrium” =-=[23]-=-, which represents a steady-state situation in which the congestion cost on each road is equal due to the fact that, as the number of drivers increases, the effect of an individual driver on the traffic condi... |

572 | Potential games
- Monderer, Shapley
- 1996
Citation Context ...ems where agent objectives are designed to achieve an overall objective [21, 20]. In such systems, at least one pure Nash equilibrium exists by design. 2.2 Potential Games Consider the class of games =-=[17]-=- where player utilities $\{U_i\}_{i=1}^n$ are aligned with a global utility $\phi : Y \mapsto \mathbb{R}$ in the following sense: For every player $P_i \in P$, for every $y_{-i} \in Y_{-i}$, and for every $\bar{y}_i, y''_i \in Y_i$, $U_i(\bar{y}_i, y_{-i})$ − Ui... |

557 | A class of games possessing pure-strategy Nash equilibria
- Rosenthal
- 1973
Citation Context ... The negative sign stems from cr(·) reflecting the cost of using a resource and Ui(·) reflecting a utility or reward function. Any congestion game with utility functions as in (4) is a potential game =-=[21, 20]-=- with the potential function $\phi(y) = -\sum_{r \in R} \sum_{k=1}^{\sigma_r(y)} c_r(k)$. 5.2.2 Distributed Traffic Routing Example We consider a simple scenario with 100 players seeking to traverse from node A to node B along 5... |

435 | Individual strategy and social structure: An evolutionary theory of institutions
- Young
- 1998
Citation Context ... action profile maximizing the potential function is always a pure Nash equilibrium, i.e., at least one pure Nash equilibrium exists in potential games. 2.3 Weakly Acyclic Games A weakly acyclic game =-=[22, 23]-=- is defined by the following condition: For any action profile y ∈ Y, there exists a finite sequence of action profiles $y^1, y^2, \ldots, y^L$ such that 1. $y^1 = y$, and 2. For any 1 ≤ ℓ ≤ L − 1, the... |

327 | A simple adaptive procedure leading to correlated equilibrium
- Hart, Mas-Colell
Citation Context ...d algorithm guarantees that a player’s maximum regret asymptotically approaches zero then the algorithm is referred to as a no-regret algorithm. The most common no-regret algorithm is regret matching =-=[8]-=-. In regret matching, at each time step, each player plays a strategy where the probability of playing an action is proportional to the positive part of his regret for that action. In a multi-agent sy... |

319 | Evolutionary Games and Equilibrium Selection
- Samuelson
- 1997

182 | Efficient algorithms for online decision problems
- Kalai, Vempala
Citation Context ...ltiagent learning; Emergent behavior; Coordination, cooperation, and teamwork. 1. INTRODUCTION The applicability of regret-based algorithms for multi-agent learning has been studied in several papers =-=[6, 3, 13, 1, 7]-=-. The appeal of regret-based algorithms is two fold. First of all, regret-based algorithms are easily implementable in large scale multi-agent systems when compared with other learning algorithms such... |

140 | Fictitious play property for games with identical interests
- Monderer, Shapley
- 1996
Citation Context ...et-based algorithms is two fold. First of all, regret-based algorithms are easily implementable in large scale multi-agent systems when compared with other learning algorithms such as fictitious play =-=[16, 12]-=-. Secondly, ∗ Jason R. Marden is a Ph.D. student in the Department of Mechanical and Aerospace Engineering, University of California, Los Angeles. Gürdal Arslan Department of Electrical Engineering Un... |

136 | An overview of collective intelligence
- Wolpert, Tumer
- 1999
Citation Context ... general, a pure Nash equilibrium may not exist for an arbitrary game. However, we are interested in engineered multi-agent systems where agent objectives are designed to achieve an overall objective =-=[21, 20]-=-. In such systems, at least one pure Nash equilibrium exists by design. 2.2 Potential Games Consider the class of games [17] where player utilities $\{U_i\}_{i=1}^n$ are aligned with a global utility φ : Y ↦... |

127 | Strategic learning and its limits
- Young
- 2004
Citation Context ... action profile maximizing the potential function is always a pure Nash equilibrium, i.e., at least one pure Nash equilibrium exists in potential games. 2.3 Weakly Acyclic Games A weakly acyclic game =-=[22, 23]-=- is defined by the following condition: For any action profile y ∈ Y, there exists a finite sequence of action profiles $y^1, y^2, \ldots, y^L$ such that 1. $y^1 = y$, and 2. For any 1 ≤ ℓ ≤ L − 1, the... |

101 | A general class of adaptive strategies
- Hart, Mas-Colell
- 2001
Citation Context ...r action in the past steps. It turns out that the average regret of a player using regret matching would asymptotically vanish (similar results hold for different regret based adaptive dynamics); see =-=[8, 9, 10]-=-. This would imply that the empirical frequencies of the action profiles y(k) would almost surely converge to the set of coarse correlated equilibria, where a coarse correlated equilibrium is any p... |

92 | Optimal payoff functions for members of collectives
- Wolpert, Tumer
- 2001
Citation Context ... general, a pure Nash equilibrium may not exist for an arbitrary game. However, we are interested in engineered multi-agent systems where agent objectives are designed to achieve an overall objective =-=[21, 20]-=-. In such systems, at least one pure Nash equilibrium exists by design. 2.2 Potential Games Consider the class of games [17] where player utilities $\{U_i\}_{i=1}^n$ are aligned with a global utility φ : Y ↦... |

86 | Autonomous vehicle-target assignment: a game theoretical formulation
- Arslan, Marden, et al.
- 2007
Citation Context ...ee. AAMAS’07, May 14–18, 2007, Honolulu, Hawai'i, USA. Copyright 2007 IFAAMAS. 1. INTRODUCTION The applicability of regret based algorithms for multi-agent learning has been studied in several papers =-=[7, 4, 14, 2, 8, 1]-=-. The appeal of regret based algorithms is two fold. First of all, regret based algorithms are easily implementable in large scale multi-agent systems when compared with other learning algorithms such... |

83 | Convergence and no-regret in multiagent learning
- Bowling
- 2005
Citation Context ...ltiagent learning; Emergent behavior; Coordination, cooperation, and teamwork. 1. INTRODUCTION The applicability of regret-based algorithms for multi-agent learning has been studied in several papers =-=[6, 3, 13, 1, 7]-=-. The appeal of regret-based algorithms is two fold. First of all, regret-based algorithms are easily implementable in large scale multi-agent systems when compared with other learning algorithms such... |

57 | Routing without regret: on convergence to Nash equilibria of regret-minimizing algorithms in routing games
- Blum, Even-Dar, et al.
- 2006
Citation Context ...would have” received had that player used a different fixed strategy at all previous time steps. No-regret algorithms have been proposed in a variety of settings ranging from network routing problems =-=[3]-=- to structured prediction problems [7]. In the more general regret based algorithms, each player makes a decision using only information regarding the regret for each of his possible actions. If an al... |

56 | Joint strategy fictitious play with inertia for potential games
- Marden, Arslan, et al.
- 2009
Citation Context ...high probability, the action corresponding to the maximum regret. This choice leads to a stochastic variant of an algorithm called Joint Strategy Fictitious Play (with fading memory and inertia); see =-=[14]-=-. Also, note that, for large values of τ, player Pi would choose any action having positive regret with equal probability. According to these rules, player Pi will stay with his previous action yi(k −... |

36 | Connections between cooperative control and potential games illustrated on the consensus problem
- Marden, Arslan, et al.
- 2007
Citation Context ...ered here can not fully model as wide variety of multi-agent systems. It turns out that weakly acyclic games, which is a generalized form of potential games, are closely related to multi-agent systems =-=[15]-=-. The connection can be seen by recognizing that in any multi-agent system there is a global objective that may or may not be known to the players. In the case that the players are aware of the global... |

33 | On no-regret learning, fictitious play, and Nash equilibrium
- Jafari, Greenwald, et al.
- 2001
Citation Context ...et-based algorithms is two fold. First of all, regret-based algorithms are easily implementable in large scale multi-agent systems when compared with other learning algorithms such as fictitious play =-=[16, 12]-=-. Secondly, ∗ Jason R. Marden is a Ph.D. student in the Department of Mechanical and Aerospace Engineering, University of California, Los Angeles. Gürdal Arslan Department of Electrical Engineering Un... |

28 | Payoff based dynamics for multi-player weakly acyclic games
- Marden, Young, et al.
- 2009
Citation Context ...aluate congestion levels on alternative routes. On the other hand, if a player is only aware of the congestion experienced, then one would need to examine the applicability of payoff based algorithms =-=[18]-=-. 6. CONCLUSIONS In this paper we analyzed the applicability of regret based algorithms on multi-agent systems. We demonstrated that a point of no-regret may not necessarily be a desirable operating c... |

22 | A general class of no-regret learning algorithms and game-theoretic equilibria
- Greenwald, Jafari
- 2003
Citation Context ...ltiagent learning; Emergent behavior; Coordination, cooperation, and teamwork. 1. INTRODUCTION The applicability of regret-based algorithms for multi-agent learning has been studied in several papers =-=[6, 3, 13, 1, 7]-=-. The appeal of regret-based algorithms is two fold. First of all, regret-based algorithms are easily implementable in large scale multi-agent systems when compared with other learning algorithms such... |

16 | Game Theory. MIT Press
- Tirole
- 1991
Citation Context ....e., the set of n dimensional probability distributions. - For $x \in \mathbb{R}^n$, $[x]^+ \in \mathbb{R}^n$ denotes the vector whose ith entry equals max(xi, 0). 2. SETUP 2.1 Finite Strategic-Form Games A finite strategic-form game =-=[6]-=- consists of an n-player set P := {P1, ..., Pn}, a finite action set Yi for each player Pi ∈ P, and a utility function Ui : Y → R for each player Pi ∈ P, where Y := Y1 × ... × Yn. We will henceforth use ... |

16 | Evolutionary Game Theory. MIT Press
- Weibull
- 1995
Citation Context ...multi-agent systems will lead to weakly acyclic games [17]. 2.5 Repeated Games Here, we deal with the issue of how players can learn to play a pure Nash equilibrium through repeated interactions; see =-=[5, 28, 27, 12, 24, 22]-=-. We assume that each player has access to its own utility function but not to the utility functions of other players. This private utilities assumption is motivated in multi-agent systems by the requ... |

15 | No-regret algorithms for structured prediction problems
- Gordon
- 2005

12 | Efficient no-regret multiagent learning
- Banerjee, Peng
- 2005

5 | Regret based continuous-time dynamics
- Hart, Mas-Colell
Citation Context ...r action in the past steps. It turns out that the average regret of a player using regret matching would asymptotically vanish (similar results hold for different regret based adaptive dynamics); see =-=[8, 9, 10]-=-. This would imply that the empirical frequencies of the action profiles y(k) would almost surely converge to the set of coarse correlated equilibria, where a coarse correlated equilibrium is any p... |

3 | On Convergence to Nash Equilibria of Regret-Minimizing Algorithms in Routing Games
- Blum, Even-Dar, et al.
- 2006
Citation Context ...would have” received had that player used a different fixed strategy at all previous time steps. No-regret algorithms have been proposed in a variety of settings ranging from network routing problems =-=[2]-=- to structured prediction problems [6]. In regret-based algorithms, each player makes a decision using only information regarding the regret for each of his possible actions. If the regret-based algor... |

1 | Multi-agent learning for engineers
- Mannor, Shamma
- 2006
Citation Context ... general, a pure Nash equilibrium may not exist for an arbitrary game. However, we are interested in engineered multi-agent systems where agent objectives are designed to achieve an overall objective =-=[26, 25, 15]-=-. In an engineered multi-agent system, there may exist a global objective function φ : Y → R that a global planner is seeking to maximize. Furthermore, players’ local utility functions need to be som... |
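
As an illustration of the regret matching rule quoted in the citation context for Hart and Mas-Colell (each player plays an action with probability proportional to the positive part of its regret for that action), the following is a minimal Python sketch. It is not taken from the paper. The function names, the running-average update, and the uniform fallback when no action has positive regret are all assumptions made for illustration.

```python
import random


def regret_matching_strategy(regrets):
    """Map a regret vector to a probability distribution over actions.

    Each probability is proportional to the positive part of the regret.
    """
    positive = [max(r, 0.0) for r in regrets]
    total = sum(positive)
    if total == 0.0:
        # Assumption: with no positive regret, play uniformly at random.
        n = len(regrets)
        return [1.0 / n] * n
    return [p / total for p in positive]


def update_average_regret(avg_regret, k, utility_of, played, others):
    """Fold one more stage into the running average regret.

    avg_regret: average regret per action over the first k stages.
    utility_of(action, others): hypothetical callback returning the
        player's own utility for `action` against the others' actions.
    played: index of the action the player actually used this stage.
    """
    realized = utility_of(played, others)
    return [
        (k * r + utility_of(a, others) - realized) / (k + 1)
        for a, r in enumerate(avg_regret)
    ]


def sample_action(regrets):
    """Draw an action index according to the regret matching strategy."""
    probs = regret_matching_strategy(regrets)
    return random.choices(range(len(probs)), weights=probs)[0]
```

For example, a player with regrets `[2.0, -1.0, 6.0]` would never pick the second action and would pick the third three times as often as the first. The update only needs the player's own utility function, which matches the private utilities assumption mentioned in the quoted contexts.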