Reinforcement Learning to Play an Optimal Nash Equilibrium in Team Markov Games (2002)  (11 citations)
Xiaofeng Wang, Tuomas Sandholm



View or download:
books.nips.cc/papers/files...CN08.ps.gz

From:  books.nips.cc/nips15

Abstract: Multiagent learning is a key problem in AI. In the presence of multiple Nash equilibria, even agents with non-conflicting interests may not be able to learn an optimal coordination policy. The problem is exacerbated if the agents do not know the game and independently receive noisy payoffs. So, multiagent reinforcement learning involves two interrelated problems: identifying the game and learning to play. In this paper, we present optimal adaptive learning, the first algorithm that...


BibTeX entry:

X. Wang and T. Sandholm. Reinforcement learning to play an optimal Nash equilibrium in team Markov games. In Advances in Neural Information Processing Systems 15 (NIPS-2002). http://citeseer.ist.psu.edu/article/wang02reinforcement.html

@misc{ wang02reinforcement,
  author = "X. Wang and T. Sandholm",
  title = "Reinforcement learning to play an optimal {N}ash equilibrium in team {M}arkov
    games",
  text = "X. Wang and T. Sandholm. Reinforcement learning to play an optimal Nash
    equilibrium in team Markov games. In Advances in Neural Information Processing
    Systems 15 (NIPS-2002).",
  year = "2002",
  url = "citeseer.ist.psu.edu/article/wang02reinforcement.html" }
Citations (may not include all citations):
268   Dynamic programming and Markov processes - Howard - 1960
135   Multi-agent reinforcement learning: independent vs - Tan - 1993
126   The theory of learning in games - Fudenberg, Levine - 1998
99   Multiagent reinforcement learning: theoretical framework and.. - Hu, Wellman - 1998
85   The dynamics of reinforcement learning in cooperative multi-.. - Claus, Boutilier - 1998
83   Learning to coordinate without sharing information - Sen, Sekaran et al. - 1994
38   Convergence results for single-step on-policy reinforcement .. - Singh, Jaakkola et al. - 2000
33   learning and coordination in multi-agent decision processes - Boutilier - 1996
33   Learning to coordinate actions in multi-agent systems - Weiß - 1993
24   Markov chain: theory and applications - Isaacson, Madsen - 1976
23   and long run equilibria in games - Kandori, Mailath et al. - 1993
21   Friend-or-Foe Q-learning in general sum game - Littman - 2001
17   Spieltheoretische behandlung eines oligopolmodells mit nachf.. - Selten - 1965
10   Value-function reinforcement learning in markov games - Littman - 2000
4   Learning in the iterated prisoner's dilemma - Sandholm, Crites - 1995

[Article contains additional citations not shown here]





CiteSeer.IST - Copyright Penn State and NEC