M. Mundhenk, J. Goldsmith, and E. Allender. The complexity of policy evaluation for finite-horizon partially-observable Markov decision processes. In Proc. 22nd Mathematical Foundations of Computer Science, pages 129-138, 1997.
.... first work on the analysis of the complexity of MDP problem variants [98, 22, 23]. More recently, as research in AI has incorporated uncertainty as an indispensable component of many problems, researchers have further investigated the computational complexity of various MDP problems and algorithms [76, 13, 93, 84]. A paper by Goldsmith and Mundhenk [45] surveys complexity results for many MDP problems and points to new directions. This thesis investigates questions raised in the first paper by Papadimitriou and Tsitsiklis on the complexity of MDPs [98] and which later work (e.g. [23, 89, 76, 84]) further ....
....time steps, and in the case of UMDPs we will consider both finite and infinite sequences of actions. This contrasts with the finite horizon objective, where the decision maker has a specified time limit in executing actions. The complexity of finite horizon POMDPs has been extensively studied; see for example [98, 76, 92, 93]. 2.3.3 Measures of Value of Action Sequences. In solving UMDPs, we need a measure of value of an action sequence to formulate optimization objectives and computational problems. We will sometimes refer to a choice of value measure as an optimality criterion. Once a value measure is defined, an ....
....and in the case of UMDPs we will consider both finite and infinite sequences of actions. This contrasts with finite horizon objectives, where the decision maker executes a fixed and known number of actions. The complexity of finite horizon MDPs and POMDPs has been extensively studied; see for example [40, 26, 38]. 2.2.3 Measures of Value of Action Sequences. In solving planning and MDP problems, we need a measure of value of an action sequence to formulate optimization objectives and computational problems. We will sometimes refer to a choice of value measure as an optimality criterion. Once a value ....