5 citations found.
P. Tseng. Solving H-horizon stationary Markov decision process in time proportional to log(H). Operations Research Letters, 9(5):287-297, 1990.

This paper is cited in the following contexts:
Polynomial Value Iteration Algorithms for Deterministic MDPs - Omid Madani (2002)

....little about their asymptotic complexity. It is known, however, that algorithms based on value iteration have no better than a pseudo polynomial run time on MDP problems [Tse90, Lit96]. In this paper, we analyze the basic value iteration procedure on the deterministic MDP problem under the average reward criterion, or the DMDP problem, and we establish several positive results. The DMDP problem is also known as the maximum (or minimum) mean cycle problem in a directed weighted ....

[Footnote in the citing paper: An algorithm has pseudo polynomial run time complexity if it runs in time polynomial in the unary representation of the ....]


Complexity results for Infinite-Horizon Markov Decision Processes - Madani (2000)

....small reward y c. Thus value iteration may take many iterations to converge to an optimal or near optimal policy or bring the value of a vertex to within, say, a constant factor of its optimal value. This is basically formalized in the statement that value iteration is a pseudo polynomial algorithm [119], meaning that it has a run time polynomial in n, m, and W (versus log W); see [119, 76]. However, as we will see, on MDP(2) problems, value iteration variants, or basically a sequence of value propagation or Bellman-Ford operations [1], are used in binary search schemes to give polynomial time ....

....or near optimal policy or bring the value of a vertex to within, say, a constant factor of its optimal value. This is basically formalized in the statement that value iteration is a pseudo polynomial algorithm [119], meaning that it has a run time polynomial in n, m, and W (versus log W); see [119, 76]. However, as we will see, on MDP(2) problems, value iteration variants, or basically a sequence of value propagation or Bellman-Ford operations [1], are used in binary search schemes to give polynomial time algorithms for solving the MDP(2) problem. In Chapter 7, value iteration is shown to ....
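
The gap between "polynomial in W" and "polynomial in log W" in the excerpt above can be made concrete with a small sketch. The two-state deterministic instance below is a hypothetical toy example, not taken from the cited thesis or from Tseng's paper: state a has a self-loop with reward 1 and an edge to state b with reward -W, and b has a self-loop with reward 2. The function name and the instance are assumptions made for illustration only. Under the average reward criterion the better choice at a is to move to b, but plain value iteration started from zero keeps preferring the self-loop for roughly W sweeps.

```python
def iterations_until_switch(W: int) -> int:
    """Count value-iteration sweeps until the greedy action at state 'a'
    changes from 'stay' to 'go' on a hypothetical two-state deterministic MDP.

    Edges (all deterministic):
      a -> a  reward 1    (average reward 1 if repeated forever)
      a -> b  reward -W   (one-time cost to reach the better cycle)
      b -> b  reward 2    (average reward 2 if repeated forever)
    """
    v_a, v_b = 0, 0  # v_0 = 0 at both states
    k = 0
    while True:
        k += 1
        stay = 1 + v_a   # value of taking the self-loop at a
        go = -W + v_b    # value of moving from a to b
        if go > stay:    # greedy action at a is now the long-run optimal one
            return k
        # Bellman update: a takes the better edge, b has only its self-loop
        v_a, v_b = max(stay, go), 2 + v_b


if __name__ == "__main__":
    for W in (10, 100, 1000):
        print(W, iterations_until_switch(W))
```

On this toy instance the count grows linearly with W, while the input needs only about log W bits to write W down, which is the sense in which the run time is pseudo polynomial rather than polynomial.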

[Article contains additional citation context not shown here]


On the Undecidability of Probabilistic Planning and.. - Madani, Hanks, Condon (2003)   (2 citations)

....they formally establish NP hardness results for a variety of finite horizon POMDP problems, but only conjecture as to the undecidability of the POMDP infinite horizon case. The computational complexity of finite horizon control problems has received considerable attention recently, see for example [48,10,39,20]. For the infinite horizon case, two questions, the complexity of goal state reachability with either nonzero probability or probability one, which reduce to reachability computations and are decidable, had been studied by Alur et al. [1] and Littman [26]. Other work considered infinite horizon ....

.... case, two questions, the complexity of goal state reachability with either nonzero probability or probability one, which reduce to reachability computations and are decidable, had been studied by Alur et al. [1] and Littman [26]. Other work considered infinite horizon fully observable MDPs [40,48], fully observable MDPs with exponentially many states but with compact representations [27,28], and the complexity of infinite horizon problems on stochastic games, which generalize MDPs [14,31]. Littman, Goldsmith, and Mundhenk [29] analyze the complexity of propositional probabilistic planning ....


On the Undecidability of Probabilistic Planning and Related.. - Madani, Hanks (2003)   (2 citations)

....in which they formally establish NP hardness results for a variety of finite horizon problems, but only conjecture as to the undecidability of the infinite horizon case. The computational complexity of finite horizon control problems has received considerable attention recently, see for example [32,6,24,14]. For the infinite horizon case, two questions, the complexity of goal state reachability with either nonzero probability or probability one, which reduce to reachability computations and are decidable, had been studied by Alur et al. [1] and Littman [17]. It is now well established that optimal ....


Polynomial Value Iteration Algorithms for Deterministic MDPs - Madani (2002)

....little about their asymptotic complexity. It is known, however, that algorithms based on value iteration have no better than a pseudo polynomial run time on MDP problems [Tse90, Lit96]. In this paper, we analyze the basic value iteration procedure on the deterministic MDP problem under the average reward criterion, or the DMDP problem, and we establish several positive results. The DMDP problem is also known as the maximum (or minimum) mean cycle problem in a directed weighted ....

[Footnote in the citing paper: An algorithm has pseudo polynomial run time complexity if it runs in time polynomial in the unary representation of the ....]

CiteSeer.IST - Copyright Penn State and NEC