N. Meuleau, K.-E. Kim, L. P. Kaelbling, and A. R. Cassandra. Solving POMDPs by searching the space of finite policies. In Proceedings of the Fifteenth Conf. on Uncertainty in Artificial Intelligence, pages 417--426, Stockholm, Sweden, 1999.
....of the hardest problems in P. Work on algorithms for concise POMDPs and AI planning have not used this general a policy representation, but for our purposes this seems like a well-founded choice. Related definitions of policies as finite state controllers have been proposed earlier [Hansen, 1998; Meuleau et al. 1999; Lusena et al. 1999]. Definition 7 (Concise policy) A concise policy for a concise POMDP $M = \langle I, O, r, B \rangle$ is a tuple $\langle T, C, v \rangle$ where $T$ is a Boolean circuit with $|B| + p$ input gates and $p$ output gates, $C$ is a Boolean circuit with $|B| + p$ input gates and $\lceil \log_2 |O| \rceil$ output gates, and $v$ is a ....
Nicolas Meuleau, Kee-Eung Kim, Leslie Pack Kaelbling, and Anthony R. Cassandra. Solving POMDPs by searching the space of finite policies. In Kathryn B. Laskey and Henri Prade, editors, Uncertainty in Artificial Intelligence, Proceedings of the Fifteenth Conference (UAI-99), pages 417--426. Morgan Kaufmann Publishers, 1999.
....many variants of POMDP problems for which it had not been proved that finding tractable exact solutions or provably good approximations is hard. There is a growing literature on heuristic solutions for POMDPs (see [Cassandra 1998; Hansen 1998b; Hauskrecht 1997; Lovejoy 1991; Lusena et al. 1999; Meuleau et al. 1999; Peshkin et al. 1999; Platzman 1977; Smallwood and Sondik 1973], for instance). Since these algorithms do not yield guaranteed optimal or near-optimal solutions, we leave a discussion of them to other sources. In this paper, we address the computational complexity, given a process and ....
Meuleau, N., Kim, K.-E., Kaelbling, L. P., and Cassandra, A. R. 1999. Solving POMDPs by searching the space of finite policies. In Proceedings of the Fifteenth Conference on Uncertainty in Artificial Intelligence (1999), pp. 417--426.
....is, fortunately, a compromise. In this work, we consider finite memory policies. These policies were introduced in Sondik's thesis (Sondik, 1971), but the most general form of finite memory policies has received little attention until recently. See, for instance, (Hansen, 1998a; Hansen, 1998b; Meuleau et al. 1999). Finite memory can be used to record the last k states seen; this restriction, finite history policies, was extensively explored in the 70s and 80s (Lovejoy, 1991). Memory could instead be used to record the time the system has run (for finite horizon problems), yielding time dependent ....
....policies. However, in many instances (including those we discuss in Section 6.1) finite memory policies outperform the optimal stationary policy significantly. Meuleau et al. have applied search heuristics to finding good finite-state controllers of a fixed size in a learning theoretic context (Meuleau et al. 1999). Their work is fairly similar to ours, but our preliminary results seem to be better, either because we assume knowledge of the model or because we are working with a better update heuristic in our local search algorithms. A major question that this work begins to explore is, How much finite ....
[Article contains additional citation context not shown here]
Meuleau, Nicolas, Kim, Kee-Eung, Hauskrecht, Milos, Cassandra, Anthony R., & Kaelbling, Leslie P. 1999. Solving POMDPs by Searching the Space of Finite Policies. submitted.
....One alternative to value based learning is direct policy search [24, 12], which is less affected by problems of partial observability but inherits all the problems that come with local search. It has been applied to learning policies that are expressed as stochastic finite state controllers [18], which might work well in the blocks world domain. These methods are appropriate when the parametric form of the policy is reasonably well known a priori, but probably do not scale to very large, open ended environments. Another strategy is to apply the pomdp framework more directly and learn a ....
Nicolas Meuleau, Kee-Eung Kim, Leslie Pack Kaelbling, and Anthony R. Cassandra. Solving POMDPs by searching the space of finite policies. (manuscript, submitted to UAI99), 1999.
....of the optimal Bayesian solution. Therefore, they need to work explicitly in the continuous space of belief functions, which is a cumbersome and sometimes intractable process. Another approach uses EM to find a finite controller that is optimal over a finite horizon [12]. In a companion paper [22], we proposed to solve problems with a very large state space by fixing the size of the policy graph and trying to find the best graph of this size. We may then hope to find a graph size that realizes a good compromise between the quality of the solution and the time required for finding it. This ....
....performing all the computation in a discrete setting, like in completely observable Markov decision processes (MDPs) [13, 24]. However, the algorithms do not provide any evaluation of the quality of the solution produced relative to the optimal performance. As we showed in the companion paper [22], finding the best finite policy graph of a given size is NP-hard. However, some classical optimization techniques such as branch-and-bound search and gradient descent can be accelerated using previous knowledge about the structure of the problem at hand and its optimal solution. Despite this ....
[Article contains additional citation context not shown here]
N. Meuleau, K.E. Kim, L.P. Kaelbling, and A.R. Cassandra. Solving POMDPs by searching the space of finite policies. Proceedings of the Fifteenth Conference on Uncertainty in Artificial Intelligence, To appear, 1999.
No context found.
N. Meuleau, K.-E. Kim, L. P. Kaelbling, and A. R. Cassandra. Solving POMDPs by searching the space of finite policies. In Proceedings of the Fifteenth Conf. on Uncertainty in Artificial Intelligence, pages 417--426, Stockholm, Sweden, 1999.
No context found.
Meuleau, N., K.-E. Kim, L. P. Kaelbling, & A. R. Cassandra (1999). Solving POMDPs by searching the space of finite policies. In Proceedings of the Fifteenth International Conference on Uncertainty in Artificial Intelligence.
No context found.
N. Meuleau, K.-E. Kim, L. P. Kaelbling, and A. R. Cassandra. Solving POMDPs by searching the space of finite policies. In Proceedings of the Fifteenth Conference on Uncertainty in Artificial Intelligence, pages 417--426, Stockholm, 1999.
No context found.
Nicolas Meuleau, Kee-Eung Kim, Leslie Pack Kaelbling, and Anthony Cassandra. Solving pomdps by searching the space of finite policies. In Proceedings of the 15th Annual 1999.
No context found.
N. Meuleau, K. E. Kim, L. Kaelbling, and A. Cassandra. Solving pomdps by searching the space of finite policies. In Proceedings of the 15th Annual Conference on Uncertainty in Artificial Intelligence (UAI-99), pages 417--426, San Francisco, CA, 1999. Morgan Kaufmann Publishers.
No context found.
Nicolas Meuleau, Kee-Eung Kim, Leslie Pack Kaelbling, and Anthony Cassandra. Solving POMDPs by Searching the Space of Finite Policies. In Proceedings of the 15th Annual Conference on Uncertainty in Artificial Intelligence, 1999.
No context found.
N. Meuleau, K. Kim, L. P. Kaelbling, and A. R. Cassandra, "Solving POMDPs by searching the space of finite policies," in Proceedings of the Fifteenth International Conference on Uncertainty in Artificial Intelligence (UAI), 1999.
No context found.
N. Meuleau, K.-E. Kim, L. P. Kaelbling, and A. R. Cassandra. Solving POMDPs by searching the space of finite policies. Proc. UAI-99, pp.417--426, Stockholm, 1999.
No context found.
Nicolas Meuleau, Kee-Eung Kim, Leslie Pack Kaelbling, and Anthony R. Cassandra. Solving POMDPs by searching the space of finite policies. In Proc. of UAI-99, pages 417--426, Stockholm, 1999.