11 citations found.
Wang, X., & Sandholm, T. (2002). Reinforcement learning to play an optimal Nash equilibrium in team Markov games. Proceedings of the Annual Conference on Neural Information Processing Systems (NIPS). Vancouver, Canada.

This paper is cited in the following contexts:
Coordination in Multiagent Reinforcement Learning: A.. - Chalkiadakis, Boutilier (2003)   (1 citation)  (Correct)

....[4] For obvious reasons, one would like RL methods that converge to desirable (e.g. optimal) equilibria. A number of heuristic exploration strategies have been proposed that in fact increase the probability (or even guarantee) that optimal equilibria are reached in identical interest games [4, 13, 12, 19]. Unfortunately, methods that encourage or force convergence to optimal equilibria often do so at a great cost. Coordination on a good strategy profile often requires exploration in parts of policy space that are very unrewarding. In such a case, the benefits of eventual coordination to an ....

....estimates. Kapetanakis and Kudenko [12] propose a method called FMQ for repeated games that uses the optimistic assumption to bias exploration, much like [4] but in the context of individual learners (that do not have explicit access to the actions performed by other agents). Wang and Sandholm [19] similarly use the optimistic assumption in repeated games to guarantee convergence to an optimal equilibrium. We critique these methods below. 3. A BAYESIAN VIEW OF MARL The spate of activity described above on MARL in identical interest games has focused exclusively on devising methods that ....

X. Wang and T. Sandholm. Reinforcement learning to play an optimal nash equilibrium in team markov games. In Advances in Neural Information Processing Systems 15 (NIPS-2002).


AWESOME: A General Multiagent Learning Algorithm that.. - Conitzer, Sandholm (2006)   (1 citation)  Self-citation (Sandholm)   (Correct)

No context found.

Wang, X., & Sandholm, T. (2002). Reinforcement learning to play an optimal Nash equilibrium in team Markov games. Proceedings of the Annual Conference on Neural Information Processing Systems (NIPS). Vancouver, Canada.


Communication Complexity as a Lower Bound for Learning in Games - Conitzer, Sandholm (2004)   Self-citation (Sandholm)   (Correct)

No context found.

Wang, X., & Sandholm, T. (2002). Reinforcement learning to play an optimal Nash equilibrium in team Markov games. Proceedings of the Annual Conference on Neural Information Processing Systems (NIPS). Vancouver, Canada.


Learning Near-Pareto-Optimal Conventions in Polynomial Time - Xiaofeng   Self-citation (Sandholm)   (Correct)

No context found.

Wang and Sandholm. Reinforcement learning to play an optimal Nash equilibrium in team markov game. In NIPS, 02.


Learning Near-Pareto-Optimal Conventions in Polynomial Time - Wang, Sandholm (2003)   Self-citation (Sandholm)   (Correct)

No context found.

Wang and Sandholm. Reinforcement learning to play an optimal Nash equilibrium in team markov game. In NIPS, 02.


Improving Coordination with Communication in Multi-agent.. Learning - Daniel Szer (2004)   (Correct)

No context found.

Xiaofeng Wang and Tuomas Sandholm. Reinforcement learning to play an optimal nash equilibrium in team markov games. In Advances in Neural Information Processing Systems 15 (NIPS'02), 2002.


Multiagent Collaborative Learning for Distributed Business.. - Yutao Guo And (2004)   (Correct)

No context found.

X. Wang and T. Sandholm. Reinforcement learning to play an optimal nash equilibrium in team markov games. In Proceedings of the 16th Neural Information Processing Systems: Natural and Synthetic (NIPS) conference, Vancouver, 2002.


Reinforcement Learning of Coordination in Heterogeneous.. - Kapetanakis, Kudenko (2004)   (Correct)

No context found.

X. Wang and T. Sandholm. Reinforcement learning to play an optimal nash equilibrium in team markov games. In Proceedings of the 16th Neural Information Processing Systems: Natural and Synthetic conference, Vancouver, Canada, 2002.


Multiagent Reinforcement Learning: Stochastic Games with.. - Chalkiadakis (2003)   (1 citation)  (Correct)

No context found.

X. Wang and T. Sandholm. Reinforcement Learning to Play An Optimal Nash Equilibrium in Team Markov Games. In Advances in Neural Information Processing Systems 15 (NIPS 2002).


Coordination in Multiagent Reinforcement Learning: A.. - Chalkiadakis, Boutilier (2003)   (1 citation)  (Correct)

No context found.

X. Wang and T. Sandholm. Reinforcement learning to play an optimal nash equilibrium in team markov games. In Advances in Neural Information Processing Systems 15 (NIPS-2002).


Countering Deception in Multiagent Reinforcement Learning - Banerjee, Peng   (Correct)

No context found.

Xiaofeng Wang and Tuomas Sandholm. Reinforcement learning to play an optimal nash equilibrium in team markov games. In Advances in Neural Information Processing Systems 15, NIPS, 2002.

CiteSeer.IST - Copyright Penn State and NEC