7 citations found.
V. B. Zubek and T. G. Dietterich, "A POMDP approximation algorithm that anticipates the need to observe," in Proceedings of PRICAI-2000.

This paper is cited in the following contexts:
Predictive Autonomous Robot Navigation - Foka (2002)   (1 citation)

....also the basis for many other techniques for solving MDPs and POMDPs. One of the most well-known and commonly used methods for approximating the optimal policy of a POMDP, based on value iteration, is the Witness algorithm [17]. Other value function approximation algorithms have been proposed in [36, 55, 62, 63, 65]. Another class of methods for solving POMDPs uses heuristic control strategies. The belief replanning [5, 32] algorithm starts by assuming it is in the most likely state and generates a path to the goal according to that state. It also generates a sequence of predicted states that the robot ....

Valentina Bayer Zubek and Thomas G. Dietterich. A POMDP approximation algorithm that anticipates the need to observe. In Pacific Rim International Conference on Artificial Intelligence, pages 521--532, 2000.


Speeding Up the Convergence of Value Iteration in Partially.. - Zhang, Zhang (2001)   (11 citations)

....(MDP). The latter is a function over the state space. So V* is being approximated by one vector. Littman et al. (1995b) extend this idea and approximate V* using |A| vectors, each of which corresponds to a Q function of the underlying MDP. A further extension is recently introduced by Zubek and Dietterich (2000). Their idea is to base the approximation not on the underlying MDP, rather on a so-called even-odd POMDP that is identical to the original POMDP except that the state is fully observable at even time steps. Platzman (1980) suggests approximating V* using the value functions of one or more ....

Zubek, V. B. and Dietterich, T. G. (2000). A POMDP approximation algorithm that anticipates the need to observe. To appear in Proceedings of the Pacific Rim Conference on Artificial Intelligence (PRICAI-2000), Lecture Notes in Computer Science, New York: Springer-Verlag.


Speeding Up the Convergence of Value Iteration in Partially.. - Zhang, Zhang (2001)   (11 citations)

....process (MDP). The latter is a function over the state space. So V is being approximated by one vector. Littman et al. (1995b) extend this idea and approximate V using |A| vectors, each of which corresponds to a Q function of the underlying MDP. A further extension is recently introduced by Zubek and Dietterich (2000). Their idea is to base the approximation not on the underlying MDP, rather on a so-called even-odd POMDP that is identical to the original POMDP except that the state is fully observable at even time steps. Platzman (1980) suggests approximating V using the value functions of one or more fixed ....

Zubek, V. B. and Dietterich, T. G. (2000). A POMDP approximation algorithm that anticipates the need to observe. To appear in Proceedings of the Pacific Rim Conference on Artificial Intelligence (PRICAI-2000), Lecture Notes in Computer Science, New York: Springer-Verlag.
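
The context above describes approximating the POMDP value function with |A| vectors, one per Q function of the underlying MDP. A minimal sketch of that idea in plain Python follows. The function names, the table layout (`P[a][s][s2]`, `R[a][s]`), and the two-state toy problem are assumptions made for illustration and do not come from any of the cited papers.

```python
def q_backup(P, R, gamma, V):
    """One Bellman backup: Q[a][s] = R[a][s] + gamma * sum_s2 P[a][s][s2] * V[s2]."""
    n = len(V)
    return [[R[a][s] + gamma * sum(P[a][s][s2] * V[s2] for s2 in range(n))
             for s in range(n)] for a in range(len(P))]


def solve_mdp(P, R, gamma, iters=200):
    """Value iteration on the underlying (fully observable) MDP."""
    V = [0.0] * len(P[0])
    for _ in range(iters):
        Q = q_backup(P, R, gamma, V)
        V = [max(Q[a][s] for a in range(len(P))) for s in range(len(V))]
    return q_backup(P, R, gamma, V), V


def q_vector_value(belief, Q):
    """Approximate V(b) as the max over the |A| vectors Q[a] of the dot product b . Q[a]."""
    return max(sum(b * q for b, q in zip(belief, Q_a)) for Q_a in Q)


# Hypothetical toy problem: action 0 stays put, action 1 switches state;
# staying in state 0 pays reward 1, everything else pays 0.
P = [[[1.0, 0.0], [0.0, 1.0]], [[0.0, 1.0], [1.0, 0.0]]]
R = [[1.0, 0.0], [0.0, 0.0]]
Q, V = solve_mdp(P, R, gamma=0.5)
print(V, q_vector_value([0.5, 0.5], Q))
```

On a pure belief state the approximation reduces to the MDP value of that state; on a mixed belief it is at most the belief-weighted average of the MDP values, because a single action must be chosen for the whole belief.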


Two Heuristics for Solving POMDPs Having a Delayed Need to.. - Zubek, Dietterich   Self-citation (Zubek Dietterich)

....about the state of the world, particularly when those observation actions are needed at some point in the future. This paper proposes two heuristics that are better than the MDP approximation in POMDPs where there is a delayed need to observe. The first approximation, introduced in [2], is the even-odd POMDP, in which the world is assumed to be fully observable every other time step. The even-odd POMDP can be converted into an equivalent MDP, the even MDP, whose value function captures some of the sensing costs of the original POMDP. An online policy, consisting in a ....

....a certain path. This paper presents two approximate solutions that work well when there is a delayed need to observe. Because the methods involve solving MDPs with the same number of states as the original POMDP, they scale well to large POMDPs. The first approximation, presented in detail in [2], defines a new POMDP, the even-odd POMDP, in which the full state of the environment is observable (for no cost) at all times t where t is even. When t is odd, the environment returns the same observation information as in the original POMDP (Figure 2). We showed that the even-odd POMDP can be ....

[Article contains additional citation context not shown here]

Bayer Zubek, V., Dietterich, T.: A POMDP Approximation Algorithm that Anticipates the Need to Observe. PRICAI 2000, LNAI 1886 (2000) 521--532 http://www.cs.orst.edu/bayer/
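
The contexts above say the even-odd POMDP can be converted into an equivalent even MDP over the fully observed even-step states. One way such a two-step backup might look is sketched below. This is an illustrative reconstruction under stated assumptions, not the authors' implementation: the observation model `O[a][s1][o]` (probability of observation `o` after action `a` lands in state `s1`), the reward layout `R[a][s][s2]`, all function names, and the toy problem are hypothetical.

```python
def even_mdp_backup(V, P, O, R, gamma):
    """Two-step backup from an observed state s at an even time step."""
    n_a, n_s, n_o = len(P), len(V), len(O[0][0])
    # q1[a2][s1]: value of taking a2 in hidden state s1 and landing on an observed leaf.
    q1 = [[sum(P[a2][s1][s2] * (R[a2][s1][s2] + gamma * V[s2]) for s2 in range(n_s))
           for s1 in range(n_s)] for a2 in range(n_a)]
    new_V = []
    for s in range(n_s):
        best = float("-inf")
        for a in range(n_a):
            immediate = sum(P[a][s][s1] * R[a][s][s1] for s1 in range(n_s))
            future = 0.0
            for o in range(n_o):
                # Unnormalized belief over the odd-step state s1 given (s, a, o).
                w = [P[a][s][s1] * O[a][s1][o] for s1 in range(n_s)]
                # The second action may depend only on the observation o.
                future += max(sum(w[s1] * q1[a2][s1] for s1 in range(n_s))
                              for a2 in range(n_a))
            best = max(best, immediate + gamma * future)
        new_V.append(best)
    return new_V


def solve_even_mdp(P, O, R, gamma, iters=200):
    """Iterate the two-step backup from V = 0 until it has effectively converged."""
    V = [0.0] * len(P[0])
    for _ in range(iters):
        V = even_mdp_backup(V, P, O, R, gamma)
    return V


# Hypothetical toy problem: each action reaches its intended state with
# probability 0.9, and landing in state 0 pays reward 1.
P = [[[0.9, 0.1], [0.1, 0.9]], [[0.1, 0.9], [0.9, 0.1]]]
R = [[[1.0, 0.0], [1.0, 0.0]] for _ in range(2)]
O_blind = [[[1.0], [1.0]] for _ in range(2)]            # one uninformative observation
O_exact = [[[1.0, 0.0], [0.0, 1.0]] for _ in range(2)]  # observation reveals the state
V_blind = solve_even_mdp(P, O_blind, R, gamma=0.9)
V_exact = solve_even_mdp(P, O_exact, R, gamma=0.9)
print(V_blind, V_exact)
```

In this toy problem the fully informative observation model recovers the value of the underlying MDP, while the uninformative one yields a lower value, which is the sense in which the even MDP value function reflects a cost of not observing.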


Two Heuristics for Solving POMDPs Having a Delayed Need to.. - Zubek, Dietterich   Self-citation (Bayer Dietterich)

....number of world actions taken between observation actions (see [5], [6] and also Sven Koenig's extension to sensor planning in [9]). The chain MDP algorithm is a heuristic for approximately solving COMDPs, starting with the MDP underlying the POMDP, and constructing a sequence of MDPs $M_1, M_2, \ldots$ whose reward functions have been modified to incorporate sensing costs. The sensing costs are esti- [Figure 2 (timeline t = 0 to t = 4; labels: "fully observable", "one observation", "Act", "(Sense) Act"): The ....

....for approximately solving COMDPs, starting with the MDP underlying the POMDP, and constructing a sequence of MDPs $M_1, M_2, \ldots$ whose reward functions have been modified to incorporate sensing costs. [Figure 2 (timeline t = 0 to t = 4; labels: "fully observable", "one observation", "Act", "(Sense) Act"): The even-odd POMDP fully observes the state (for free) every other time step.] The sensing costs are estimated by performing a 2-step lookahead in each state, and evaluating the leaf nodes using the optimal value ....

[Article contains additional citation context not shown here]

Bayer, V., Dietterich, T.: A POMDP Approximation Algorithm that Anticipates the Need to Observe. Technical Report 00-30-01, Oregon State University, Dept. of Computer Science (2000)


A POMDP Approximation Algorithm that Anticipates the Need to.. - Zubek, Dietterich (2000)   (4 citations)  Self-citation (Bayer Dietterich)

....monotonic, the inequality is true when we apply $h_{2MDP}$ to both sides: $h_{2MDP}^2 V_{MDP}(s) \le h_{2MDP} V_{MDP}(s)$. By induction, $h_{2MDP}^k V_{MDP}(s) \le V_{MDP}(s)$ for all $k$. $\lim_{k \to \infty} h_{2MDP}^k V_{MDP} = V_{2MDP} \le V_{MDP}$. Q.E.D. Theorem 2. $V_{POMDP}(s) \le V_{2MDP}(s)$ for all $s \in S$. See [1] for proof. These two theorems establish that $V_{2MDP}$ is a better approximation to $V_{POMDP}$ than $V_{MDP}$ on pure belief states. We extend this result to arbitrary belief states $b$ by considering a 2-step lookahead process. Let LA(n) be an operator defined such that LA(n)V(b) estimates the ....

....the fully observable leaf states using $V$. For example, LA(1) can be written $LA(1)V(b) = \max_a \sum_s b(s) \sum_{s'} P_{tr}(s' \mid s, a) \left[ R(s' \mid s, a) + \gamma V(s') \right]$. Theorem 3. For all belief states $b$, $V_{POMDP}(b) \le LA(2)V_{2MDP}(b) \le LA(2)V_{MDP}(b) \le LA(1)V_{MDP}(b)$. See [1] for proof. Figure 3 depicts this relationship for a 2-state finite-horizon POMDP. All value functions are piecewise linear and convex. 3.3 Even MDP Approximation Algorithm. We can easily compute $V_{2MDP}$ offline via value iteration. To generate a policy for the original POMDP, we maintain a ....

Bayer, V., Dietterich, T.: A POMDP Approximation Algorithm that Anticipates the Need to Observe. Technical Report 00-30-01, Oregon State University, Dept. of Computer Science (2000)
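
The one-step lookahead operator LA(1) quoted in the context above can be sketched in plain Python as follows. The function name `la1`, the table layout (`P[a][s][s2]` for the transition probability, `R[a][s][s2]` for the reward on reaching `s2`), and the toy numbers are assumptions for illustration only; the leaf values `V` are simply supplied by hand here.

```python
def la1(belief, V, P, R, gamma):
    """LA(1)V(b) = max_a sum_s b(s) sum_s2 P[a][s][s2] * (R[a][s][s2] + gamma * V[s2])."""
    best = float("-inf")
    for a in range(len(P)):
        total = 0.0
        for s, b_s in enumerate(belief):
            for s2, p in enumerate(P[a][s]):
                # Leaf states are treated as fully observable and scored with V.
                total += b_s * p * (R[a][s][s2] + gamma * V[s2])
        best = max(best, total)
    return best


# Hypothetical toy problem: action 0 stays put, action 1 switches state;
# staying in state 0 pays reward 1, everything else pays 0.
P = [[[1.0, 0.0], [0.0, 1.0]], [[0.0, 1.0], [1.0, 0.0]]]
R = [[[1.0, 0.0], [0.0, 0.0]], [[0.0, 0.0], [0.0, 0.0]]]
V = [2.0, 1.0]  # leaf values, assumed given (for example from value iteration)
print(la1([1.0, 0.0], V, P, R, gamma=0.5))
print(la1([0.5, 0.5], V, P, R, gamma=0.5))
```

The same function can be applied with any leaf value table, so comparing the lookahead under two different approximations only requires swapping `V`.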


Algorithms for Partially Observable Markov Decision Processes - Zhang (2001)

No context found.

V. B. Zubek and T. G. Dietterich, "A POMDP approximation algorithm that anticipates the need to observe," in Proceedings of PRICAI-2000.

CiteSeer.IST - Copyright Penn State and NEC