49 citations found.
Hauskrecht, M., Meuleau, N., Kaelbling, L. P., Dean, T., & Boutilier, C. (1998). Hierarchical solution of Markov decision processes using macro-actions. In Proceedings of the Fourteenth Annual Conference on Uncertainty in Artificial Intelligence (UAI-98).

This paper is cited in the following contexts:

Decision-Theoretic Control of Planetary Rovers - Zilberstein, Washington.. (2002)

....or actual experience are available. The algorithm fits into the category of hierarchical reinforcement learning (e.g. [32]) because it learns simultaneously at the state level and at the subprocess level. We note that other researchers have proposed methods for solving weakly coupled MDPs [16, 20, 26], but very little work has been done in a reinforcement learning context. The hierarchical algorithm has been compared with Q-learning; it is shown to perform better initially, but it fails to converge to the optimal policy. Hence, the hierarchical approach could be beneficial in situations in ....

Hauskrecht, M., Meuleau, N., Kaelbling, L.P., Dean, T., Boutilier, C.: Hierarchical Solution of Markov Decision Processes Using Macro-Actions. Fourteenth International Conference on Uncertainty in Artificial Intelligence (1998)


A Hybrid Architecture for Adaptive Robot Control - Huber

....impractical in general for on-line learning in complex robot control tasks. Motivated in part by these considerations, the reinforcement learning framework has recently been extended to permit more complex and temporally extended actions in the context of Semi-Markov Decision Processes (SMDP) [56, 90, 86, 49, 37]. In a different line of work, behavior-based control techniques have been used in conjunction with reinforcement learning algorithms to address the issue of learning speed and native reactivity. One approach to utilizing such a combination is to decompose the problem a priori and apply the ....

....Previously acquired control policies in this context are treated as temporally extended actions, reducing the set of points at which the learning agent has to make a new decision. Recent work has extended the basic reinforcement learning framework to allow for the use of such extended actions [90, 86, 49, 37]. Here, the underlying system is modeled as a Semi-Markov Decision Problem (SMDP) and abstract actions are formed as policies constructed from primitive actions with associated termination conditions. This permits policies to be constructed in the SMDP framework which could not be formed in a ....

Hauskrecht, Milos, Meuleau, Nicolas, Boutilier, Craig, Kaelbling, Leslie Pack, and Dean, Thomas. Hierarchical solution of Markov decision processes using macro-actions. In International Conference on Uncertainty in Artificial Intelligence (1998).


Learning State Features from Policies to Bias Exploration in.. - Singer, Veloso (1999)

....starts from scratch because each problem may have a different set of states and rewards. This has been recognized as rather unfortunate and several approaches have been and are being investigated to find structure, abstraction, generalization, and/or policy reuse in reinforcement learning (e.g. [4, 14, 5, 8, 1, 11]). The work presented in this paper contributes a technique within this line of research. It builds upon the assumption that the solutions to a series of problems contain biasing information about selecting actions to more efficiently solve new problems within the domain. The algorithm presented ....

M. Hauskrecht, N. Meuleau, L. P. Kaelbling, T. Dean, and C. Boutilier. Hierarchical solution of Markov decision processes using macro-actions. In Proceedings of the Fourteenth Annual Conference on Uncertainty in Artificial Intelligence (UAI-98), 1998.


Distributed Planning in Hierarchical Factored MDPs - Guestrin, Gordon (2002)

....Kushner and Chen were the first to apply Dantzig-Wolfe decomposition to Markov decision processes, while Dean and Lin combined decomposition with state abstraction. Hierarchical planning algorithms include MAXQ [7], hierarchies of abstract machines [16], and planning with macro operators [22, 9]. By contrast, in a parallel decomposition, multiple subproblems can be active at the same time, and the combined state space is the cross product of the subproblem state spaces. The size of the combined problem is therefore exponential rather than linear in the number of subproblems, which means ....

M. Hauskrecht, N. Meuleau, L. Kaelbling, T. Dean, and C. Boutilier. Hierarchical solution of Markov decision processes using macro-actions. In UAI-98, 1998.


Planetary Rover Control as a Markov Decision Process - Bernstein, Zilberstein.. (2001)

....depend on each other through the resources. In other words, when the rover decides to leave a site, the function that determines the next state only takes as input the time remaining, battery capacity, and data storage. This type of structure is reminiscent of work involving weakly coupled MDPs [5, 6]. One way to exploit this structure is to have two value functions, a high-level function that only takes resource levels as input and is only updated during transitions between sites, and a low-level function that can take any of the state components as input and is updated at every time step. We ....

M. Hauskrecht, N. Meuleau, L. P. Kaelbling, T. Dean, and C. Boutilier. Hierarchical solution of Markov decision processes using macro-actions. In Proceedings of the Fourteenth Conference on Uncertainty in Artificial Intelligence, 1998.


Decision-Theoretic Control of Planetary Rovers - Zilberstein, Washington.. (2002)

....or actual experience are available. The algorithm fits into the category of hierarchical reinforcement learning (e.g. [28]) because it learns simultaneously at the state level and at the subprocess level. We note that other researchers have proposed methods for solving weakly coupled MDPs [15, 18, 23], but very little work has been done in a reinforcement learning context. The hierarchical algorithm has been compared with Q-learning; it is shown to perform better initially, but it fails to converge to the optimal policy. A third algorithm which is given the optimal values for the bottleneck ....

M. Hauskrecht, N. Meuleau, L.P. Kaelbling, T. Dean, and C. Boutilier. Hierarchical solution of Markov decision processes using macro-actions. Fourteenth International Conference on Uncertainty in Artificial Intelligence, 1998.


Existence of Multiagent Equilibria with Limited Agents - Bowling, Veloso (2002)   (1 citation)

....moving the ball down the field is a good heuristic for goal progression, but at times the optimal goal scoring policy is to pass the ball backwards to an open teammate. Subproblem reuse also has a similar effect, where a subgoal is used in a portion of the state space to speed learning, e.g. [Hauskrecht et al. 1998; Bowling and Veloso, 1999]. These subgoals, though, may not be optimal for the global problem and so prevent the agent from playing optimally. Parameterized policies are receiving a great deal of attention as a way for reinforcement learning to scale to large problems, e.g. Sutton et al. ....

M. Hauskrecht, N. Meuleau, L. P. Kaelbling, T. Dean, and C. Boutilier. Hierarchical solution of Markov decision processes using macro-actions. In Proceedings of the Fourteenth Annual Conference on Uncertainty in Artificial Intelligence (UAI-98), 1998.


More Realistic Human Behavior Models for Agents in Virtual.. - Silverman (2001)   (1 citation)

....behavior is that of believability of agents. The basic premise is that characters should appear to be alive, to think broadly, to react emotionally and with personality to appropriate circumstances. There is a growing graphics and animated agent literature on the believability topic (e.g. see [6-19, 55-58]) and much of this work focuses on using great personality to mask the lack of deeper reasoning ability. However, in this paper we are less interested in the kinesthetics, media and broadly appealing personalities, than we are in the planning, judging, and choosing types of July 2001, Systems ....

....toward the utility maximum in any given step. Emotional construals on the other hand can redefine where the NE occur in the payoff table or they can redefine the entire game. That is, an emotive agent can recognize a meta game and shift the play to a higher level of systemic interaction: e.g. see [3-4, 45-58, 64-65]. An example might be a terrorist who martyrs himself rather than be caught, and by that catalyzes his cause. 1.2) Meta Games and Interactive Dramas In general, this research focuses on games that have multiple stages, each of which may involve expansion to part of a higher level, or ....

[Article contains additional citation context not shown here]

Hauskrecht, M., et al., "Hierarchical Solution of Markov Decision Processes using Macro-Actions,"


A Clustering Approach to Solving Large Stochastic.. - Milos Hauskrecht.. (2001)   Self-citation (Hauskrecht)

No context found.

Milos Hauskrecht, Nicolas Meuleau, Craig Boutilier, Leslie Pack Kaelbling, and Thomas Dean. Hierarchical solution of Markov decision processes using macro-actions. In Proceedings of the Fourteenth pages 220--229, Madison, WI, 1998.


Efficient Methods For Computing Investment Strategies For.. - Hauskrecht, Al. (2001)   Self-citation (Hauskrecht)

No context found.

Hauskrecht, M., C. Boutilier, N. Meuleau, L. Kaelbling, and T. Dean. 1998. Hierarchical solution of Markov decision processes using macro-actions. In Proc. 14th Conference on Uncertainty in Artificial Intelligence, pp. 220--229.


Approximate Solutions to Factored Markov Decision Processes via - Greedy Search In   Self-citation (Meuleau Dean)

No context found.

Hauskrecht, M.; Meuleau, N.; Boutilier, C.; Kaelbling, L. P.; and Dean, T. 1998. Hierarchical solution of Markov decision processes using macro-actions. In Proceedings of the Fourteenth Conference on Uncertainty in Artificial Intelligence, 220--229.


Journal of Artificial Intelligence Research 22 (2004).. - Michael Bowling Bowling

No context found.

Hauskrecht, M., Meuleau, N., Kaelbling, L. P., Dean, T., & Boutilier, C. (1998). Hierarchical solution of Markov decision processes using macro-actions. In Proceedings of the Fourteenth Annual Conference on Uncertainty in Artificial Intelligence (UAI-98).


Existence of Multiagent Equilibria with Limited Agents - Michael Bowling Mhb

No context found.

Hauskrecht, M., Meuleau, N., Kaelbling, L. P., Dean, T., & Boutilier, C. (1998). Hierarchical solution of Markov decision processes using macro-actions. In Proceedings of the Fourteenth Annual Conference on Uncertainty in Artificial Intelligence (UAI-98).


Journal of Machine Learning Research 7 (2006) 2259-2301.. - Anders Jonsson Anders

No context found.

M. Hauskrecht, N. Meuleau, L. Kaelbling, T. Dean, and C. Boutilier. Hierarchical Solution of Markov Decision Processes using Macro-actions. Uncertainty in Artificial Intelligence, 14:220--229, 1998.


Probabilistic Inference for Solving Discrete and.. - Markov Decision Processes

No context found.

Hauskrecht, M., Meuleau, N., Kaelbling, L. P., Dean, T., & Boutilier, C. (1998). Hierarchical solution of Markov decision processes using macro-actions. Proc. of Uncertainty in Artificial Intelligence (UAI 1998) (pp. 220--229).


A Multisine Approach for Trajectory Optimization.. - Mihaylova, De.. (2003)

No context found.

M. Hauskrecht, N. Meuleau, C. Boutilier, L.P. Kaelbling, T. Dean, Hierarchical solution of Markov decision processes using macro-actions, in: Proceedings of the 14th International Conference on Uncertainty in Artificial Intelligence, Madison, WI, 1998, pp. 220--229.


Hierarchies of Probabilistic Models of Space for Mobile.. - Diard, Bessiere, Mazer (2003)

No context found.

Milos Hauskrecht, Nicolas Meuleau, Leslie Pack Kaelbling, Thomas Dean, and Craig Boutilier. Hierarchical solution of Markov decision processes using macro-actions. In Gregory F. Cooper and Serafín Moral, editors, Proceedings of the 14th Conference on Uncertainty in Artificial Intelligence (UAI-98), pages 220--229, San Francisco, July 24--26, 1998. Morgan Kaufmann.


Hierarchies of Probabilistic Models of Navigation: the.. - Diard, Bessiere, Mazer (2004)

No context found.

M. Hauskrecht, N. Meuleau, L. P. Kaelbling, T. Dean, and C. Boutilier, "Hierarchical solution of Markov decision processes using macro-actions," in Proceedings of the 14th Conf. on Uncertainty in Artificial Intelligence (UAI-98), G. F. Cooper and S. Moral, Eds. San Francisco: Morgan Kaufmann, July 24--26, 1998, pp. 220--229.


Combining Probabilistic Models of Space for Mobile.. - Diard, Bessiere, Mazer (2003)

No context found.

M. Hauskrecht, N. Meuleau, L. P. Kaelbling, T. Dean, and C. Boutilier. Hierarchical solution of Markov decision processes using macro-actions. In G. F. Cooper and S. Moral, editors, Proceedings of the 14th Conference on Uncertainty in Artificial Intelligence (UAI-98), pages 220--229, San Francisco, July 24--26, 1998. Morgan Kaufmann.


Algorithms for Partially Observable Markov Decision Processes - Zhang (2001)

No context found.

M. Hauskrecht, N. Meuleau, C. Boutilier, L. P. Kaelbling, and T. Dean, "Hierarchical solution of Markov decision processes using macro-actions," in 14th Conference on Uncertainty in Artificial Intelligence (UAI), pp. 220--229, 1998.


Modular Self-Organization for a Long-Living Autonomous Agent - Bruno Scherrer Scherrer (2003)

No context found.

M. Hauskrecht, N. Meuleau, L. P. Kaelbling, T. Dean, and C. Boutilier. Hierarchical solution of Markov Decision Processes using macro-actions. In Uncertainty in Artificial Intelligence, pages 220--229, 1998.


Multi-Level Multi-Perspective Reasoning - Suman Sundaresh Peter

No context found.

Milos Hauskrecht, Nicolas Meuleau, Craig Boutilier, Leslie Pack Kaelbling, and Thomas Dean. Hierarchical solution of Markov decision processes using macro-actions. In Uncertainty in Artificial Intelligence: Proceedings of the Fourteenth Conference, 1998. Morgan Kaufmann.


Generalizing Plans to New Environments in Relational MDPs - Guestrin, Koller.. (2003)   (3 citations)

No context found.

M. Hauskrecht, N. Meuleau, L. Kaelbling, T. Dean, and C. Boutilier. Hierarchical solution of Markov decision processes using macro-actions. In UAI, 1998.


Reinforcement Learning for Weakly-Coupled MDPs and an.. - Bernstein, Zilberstein (2001)

No context found.

Hauskrecht, M., Meuleau, N., Kaelbling, L. P., Dean, T. & Boutilier, C. (1998). Hierarchical solution of Markov decision processes using macro-actions. In UAI-98.

CiteSeer.IST - Copyright Penn State and NEC