Abstract: Multiagent learning is a key problem in AI. In the presence of multiple
Nash equilibria, even agents with non-conflicting interests may not
be able to learn an optimal coordination policy. The problem is exacerbated
if the agents do not know the game and independently receive
noisy payoffs. So, multiagent reinforcement learning involves two interrelated
problems: identifying the game and learning to play. In this paper,
we present optimal adaptive learning, the first algorithm that...
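The coordination problem described in the abstract can be illustrated with a minimal Python sketch. The payoff matrix, function names, and noise model below are hypothetical and chosen only for illustration; this is not the optimal adaptive learning algorithm from the paper. The sketch enumerates the pure-strategy Nash equilibria of a small common-payoff game and estimates a single payoff from noisy samples.

```python
import random

# Hypothetical common-payoff (team) game: both agents receive PAYOFF[a1][a2].
# The values are assumptions for illustration, not taken from the paper.
PAYOFF = [
    [10, 0, 0],
    [0, 5, 0],
    [0, 0, 10],
]


def pure_nash_equilibria(payoff):
    """Return joint actions from which neither agent gains by deviating alone."""
    n_rows, n_cols = len(payoff), len(payoff[0])
    equilibria = []
    for a1 in range(n_rows):
        for a2 in range(n_cols):
            value = payoff[a1][a2]
            # Agent 1 cannot improve by switching rows while agent 2 stays put.
            row_ok = all(payoff[b][a2] <= value for b in range(n_rows))
            # Agent 2 cannot improve by switching columns while agent 1 stays put.
            col_ok = all(payoff[a1][b] <= value for b in range(n_cols))
            if row_ok and col_ok:
                equilibria.append((a1, a2))
    return equilibria


def optimal_equilibria(payoff):
    """Return the pure Nash equilibria with the highest common payoff."""
    equilibria = pure_nash_equilibria(payoff)
    best = max(payoff[a1][a2] for a1, a2 in equilibria)
    return [(a1, a2) for a1, a2 in equilibria if payoff[a1][a2] == best]


def estimate_payoff(payoff, a1, a2, samples=2000, noise_std=1.0, seed=0):
    """Average noisy observations of one joint action (assumed Gaussian noise)."""
    rng = random.Random(seed)
    total = sum(payoff[a1][a2] + rng.gauss(0.0, noise_std) for _ in range(samples))
    return total / samples
```

In this assumed game, (0, 0) and (2, 2) are both optimal equilibria, yet if one agent aims for the first and the other for the second, the resulting joint action (0, 2) pays 0. Averaging noisy samples corresponds to the "identifying the game" half of the problem, and jointly settling on one optimal equilibrium corresponds to "learning to play".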
Similar documents based on text:
0.8: Reinforcement Learning to Play an Optimal Nash Equilibrium in .. - Wang, Sandholm (2002)
0.3: Learning Near-Pareto-Optimal Conventions in Polynomial Time - Wang, Sandholm (2003)
BibTeX entry:
X. Wang and T. Sandholm. Reinforcement learning to play an optimal Nash equilibrium in team Markov games. In Advances in Neural Information Processing Systems 15 (NIPS-2002). http://citeseer.ist.psu.edu/article/wang02reinforcement.html
@misc{ wang02reinforcement,
author = "X. Wang and T. Sandholm",
title = "Reinforcement learning to play an optimal {N}ash equilibrium in team {M}arkov
games",
text = "X. Wang and T. Sandholm. Reinforcement learning to play an optimal Nash
equilibrium in team Markov games. In Advances in Neural Information Processing
Systems 15 (NIPS-2002).",
year = "2002",
url = "citeseer.ist.psu.edu/article/wang02reinforcement.html" }
Citations (may not include all citations):
268: Dynamic programming and Markov processes - Howard - 1960
135: Multi-agent reinforcement learning: independent vs - Tan - 1993
126: The theory of learning in games - Fudenberg, Levine - 1998
99: Multiagent reinforcement learning: theoretical framework and.. - Hu, Wellman - 1998
85: The dynamics of reinforcement learning in cooperative multi-.. - Claus, Boutilier - 1998
83: Learning to coordinate without sharing information - Sen, Sekaran et al. - 1994
38: Convergence results for single-step on-policy reinforcement .. - Singh, Jaakkola et al. - 2000
33: learning and coordination in multi-agent decision processes - Boutilier - 1996
33: Learning to coordinate actions in multi-agent systems - Wei - 1993
24: Markov chain: theory and applications - Isaacson, Madsen - 1976
23: and long run equilibria in games - Kandori, Mailath et al. - 1993
21: Friend-or-Foe Q-learning in general sum game - Littman - 2001
17: Spieltheoretische behandlung eines oligopolmodells mit nachf.. - Selten - 1965
10: Value-function reinforcement learning in markov games - Littman - 2000
4: Learning in the iterated prisoner's dilemma - Sandholm, Crites - 1995
[Article contains additional citations not shown here]
Documents on the same site (http://books.nips.cc/nips15.html):
Learning Attractor Landscapes for Learning Motor Primitives - Ijspeert, Nakanishi, Schaal (2003)
A Statistical Mechanics Approach to Approximate Analytical.. - Malzahn, Opper (2003)
Going Metric: Denoising Pairwise Data - Roth, Laub, Buhmann, Müller (2002)