by Hajime Fujita, Yoichiro Matsuno, Shin Ishii
IEEE Trans. Syst., Man, & Cybern.
http://hawaii.naist.jp/~hajime-f/papers/IEEE-SMC03-hajime-f.ps
Abstract:
We formulate an automatic strategy acquisition problem for the multi-agent card game “Hearts” as a reinforcement learning (RL) problem. Because many cards are often unobservable in this game, the RL problem is treated approximately within the framework of a partially observable Markov decision process (POMDP). This article presents a POMDP-RL method based on estimating the unobservable state variables and predicting the actions of the opponent agents. Simulation results show that our model-based POMDP-RL method is applicable to a realistic multi-agent problem.
Citations
408 | Planning and acting in partially observable stochastic domains – Kaelbling, Littman, et al. - 1998
377 | Neuronlike adaptive elements that can solve difficult learning control problems – Barto, Sutton, et al. - 1983
239 | Prioritized sweeping: Reinforcement learning with less data and less real time – Moore, Atkeson - 1993
149 | TD-Gammon, a self-teaching backgammon program, achieves master-level play – Tesauro - 1994
65 | Monte Carlo POMDPs – Thrun
32 | On-line EM algorithm for the normalized Gaussian network – Sato, Ishii - 2000
20 | GIB: Imperfect information in a computationally challenging game – Ginsberg
7 | Strategy acquisition for the game Othello based on reinforcement learning – Yoshioka, Ishii, et al. - 1999
4 | A multi-agent reinforcement learning method for a partially-observable competitive game – Matsuno, Yamazaki, et al. - 2001
3 | Blackjack as a test bed for learning strategies in neural networks – Pérez-Uribe, Sanchez - 1998
3 | Markov games as a framework for multi-agent reinforcement learning – Littman - 1994
1 | TD-Gammon, a self-teaching backgammon program, achieves master-level play – Tesauro - 1994