4 citations found.
S. Choi, D. Yeung, and N. Zhang. Hidden-mode Markov decision processes. In IJCAI Workshop on Neural, Symbolic, and Reinforcement Methods for Sequence Learning, 1999.


This paper is cited in the following contexts:
Solving Hidden-Mode Markov Decision Problems - Choi, Zhang, al. (2001)   Self-citation (Choi Yeung Zhang)   (Correct)

....improvement techniques, a value iteration algorithm is also implemented. Empirical results show that the HM-MDP approach outperforms the POMDP one by several orders of magnitude with respect to both space requirement and speed. 1 INTRODUCTION Hidden-mode Markov decision processes (HM-MDPs) [5, 4] are a novel mathematical framework for a subclass of nonstationary reinforcement learning problems. Unlike traditional nonstationary reinforcement learning, in which slowly varying environment dynamics are assumed, HM-MDPs are for a specific type of nonstationary problem where environmental ....

....be converted into POMDPs with an augmented state space. While POMDPs are superior in terms of representational power, HM-MDPs require fewer parameters and are a more natural formulation for certain types of nonstationary problems. This simplification has shown significant speedup in model learning [5, 4]. This paper considers how HM-MDP problems can be solved efficiently. Obviously, a straightforward approach is to convert an HM-MDP problem into a POMDP one and then apply existing POMDP algorithms to find a solution. Nevertheless, POMDP algorithms do not exploit the special structures of ....

S. P. M. Choi, D. Y. Yeung, and N. L. Zhang. Hidden-Mode Markov decision processes. In IJCAI 99 Workshop on Neural, Symbolic, and Reinforcement Methods for Sequence Learning, 1999.


An Environment Model for Nonstationary Reinforcement Learning - Choi, Yeung, Zhang (2000)   Self-citation (Choi Yeung Zhang)   (Correct)

....online RL algorithms can be employed to keep track of the changes. The online approach is memoryless in the sense that even if the environment ever reverts to the previously learned dynamics, learning must still start all over again. 1.1 Our Proposed Model This paper proposes a formal model [1] for nonstationary environments that repeat their dynamics in certain ways. Our model is inspired by observations of real-world nonstationary tasks with the following properties: Property 1. Environmental changes are confined to a small number of modes, which are stationary ....

....to decrease as the data size increases. Larger models have also been tested. While HM-MDP Baum-Welch is able to learn models with several hundred states and a few modes, POMDP Baum-Welch was unable to complete the learning in a reasonable time. Additional experimental results can be found in [1]. 5 Discussions and Future Work The usefulness of a model depends on the validity of the assumptions made. We now discuss the assumptions of HM-MDP, and shed some light on its applicability to real-world nonstationary tasks. Some possible extensions are also discussed. Modeling a nonstationary ....

S. P. M. Choi, D. Y. Yeung, and N. L. Zhang. Hidden-mode Markov decision processes. In IJCAI 99 Workshop on Neural, Symbolic, and Reinforcement Methods for Sequence Learning, 1999.


All Learning is Local: Multi-agent learning in global reward .. - Chang, Ho, Kaelbling   (Correct)

No context found.

S. Choi, D. Yeung, and N. Zhang. Hidden-mode Markov decision processes. In IJCAI Workshop on Neural, Symbolic, and Reinforcement Methods for Sequence Learning, 1999.


All Learning is Local: Multi-agent learning in global reward .. - Chang, Ho, Kaelbling (2003)   (Correct)

No context found.

S. Choi, D. Yeung, and N. Zhang. Hidden-mode Markov decision processes. In IJCAI Workshop on Neural, Symbolic, and Reinforcement Methods for Sequence Learning, 1999.


CiteSeer.IST - Copyright Penn State and NEC