A fast analytical algorithm for solving markov decision processes with real-valued resources (2007)

by J Marecki, S Koenig, M Tambe
Venue: In Proc. of IJCAI

A Heuristic Search Approach to Planning with Continuous Resources in Stochastic Domains

by Nicolas Meuleau, Emmanuel Benazera, Ronen I. Brafman, Eric A. Hansen
Cited by 4 (0 self)
We consider the problem of optimal planning in stochastic domains with resource constraints, where the resources are continuous and the choice of action at each step depends on resource availability. We introduce the HAO* algorithm, a generalization of the AO* algorithm that performs search in a hybrid state space that is modeled using both discrete and continuous state variables, where the continuous variables represent monotonic resources. Like other heuristic search algorithms, HAO* leverages knowledge of the start state and an admissible heuristic to focus computational effort on those parts of the state space that could be reached from the start state by following an optimal policy. We show that this approach is especially effective when resource constraints limit how much of the state space is reachable. Experimental results demonstrate its effectiveness in the domain that motivates our research: automated planning for planetary exploration rovers.

Symbolic dynamic programming for discrete and continuous state mdps

by Scott Sanner, Karina Valdivia Delgado, Leliane Nunes De Barros - In UAI-2011, 2011
Cited by 2 (2 self)
Many real-world decision-theoretic planning problems can be naturally modeled with discrete and continuous state Markov decision processes (DC-MDPs). While previous work has addressed automated decision-theoretic planning for DC-MDPs, optimal solutions have only been defined so far for limited settings, e.g., DC-MDPs having hyper-rectangular piecewise linear value functions. In this work, we extend symbolic dynamic programming (SDP) techniques to provide optimal solutions for a vastly expanded class of DC-MDPs. To address the inherent combinatorial aspects of SDP, we introduce the XADD, a continuous variable extension of the algebraic decision diagram (ADD) that maintains compact representations of the exact value function. Empirically, we demonstrate an implementation of SDP with XADDs on various DC-MDPs, showing the first optimal automated solutions to DC-MDPs with linear and nonlinear piecewise partitioned value functions and showing the advantages of constraint-based pruning for XADDs.

Strategic Advice Provision in Repeated Human-Agent Interactions

by Amos Azaria, Zinovi Rabinovich, Sarit Kraus, Claudia V. Goldman
Cited by 2 (1 self)
This paper addresses the problem of automated advice provision in settings that involve repeated interactions between people and computer agents. This problem arises in many real-world applications such as route selection systems and office assistants. To succeed in such settings, agents must reason about how their actions in the present influence people’s future actions. This work models such settings as a family of repeated bilateral games of incomplete information called “choice selection processes”, in which players may share certain goals, but are essentially self-interested. The paper describes several possible models of human behavior that were inspired by behavioral economic theories of people’s play in repeated interactions. These models were incorporated into several agent designs to repeatedly generate offers to people playing the game. These agents were evaluated in extensive empirical investigations including hundreds of subjects who interacted with computers in different choice selection processes. The results revealed that an agent that combined a hyperbolic discounting model of human behavior with a social utility function was able to outperform alternative agent designs, including an agent that approximated the optimal strategy using continuous MDPs and an agent using epsilon-greedy strategies to describe people’s behavior. We show that this approach was able to generalize to new people as well as choice selection processes that were not used for training. Our results demonstrate that combining computational approaches with behavioral economics models of people in repeated interactions facilitates the design of advice provision strategies for a large class of real-world settings.
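The hyperbolic discounting model mentioned in the abstract above can be illustrated with a minimal sketch. The abstract does not give the authors' exact formulation, so the code below uses the common textbook form, value = reward / (1 + k * delay), and contrasts it with exponential discounting. The function names, the parameter `k`, and the example rewards are assumptions chosen for illustration, not taken from the paper.

```python
def hyperbolic_value(reward: float, delay: float, k: float) -> float:
    """Present value under hyperbolic discounting (textbook form, assumed here)."""
    return reward / (1.0 + k * delay)


def exponential_value(reward: float, delay: float, gamma: float) -> float:
    """Present value under standard exponential discounting, for comparison."""
    return reward * gamma ** delay


def prefers_later(small, soon, large, late, k):
    """True if the larger, later reward has the higher hyperbolic value."""
    return hyperbolic_value(large, late, k) > hyperbolic_value(small, soon, k)


# Hypothetical choice: 5 units now versus 10 units after 4 steps, with k = 1.
# Pushing both options 10 steps into the future flips the preference, which
# is the kind of time-inconsistent behavior hyperbolic models can capture.
choice_now = prefers_later(5.0, 0, 10.0, 4, k=1.0)
choice_shifted = prefers_later(5.0, 10, 10.0, 14, k=1.0)
```

Under exponential discounting the ranking of the two options would not change when both are delayed by the same amount, so the reversal above is specific to the hyperbolic form.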

Planning in Hybrid Structured Stochastic Domains

by Branislav Kveton, 2006
Cited by 1 (0 self)
Efficient representations and solutions for large structured decision problems with continuous and discrete variables are among the important challenges faced by the designers of automated decision support systems. In this work, we describe a novel hybrid factored Markov decision process (MDP) model that allows for a compact representation of these problems, and a hybrid approximate linear programming (HALP) framework that permits their efficient solution. The central idea of HALP is to approximate the optimal value function of an MDP by a linear combination of basis functions and optimize its weights by linear programming. We study both theoretical and practical aspects of this approach, and demonstrate its scale-up potential on several hybrid optimization problems.

Stochastic Model Predictive Control of Time-Variant Nonlinear Systems with Imperfect State Information

by Florian Weissel, Thomas Schreiter, Marco F. Huber, Uwe D. Hanebeck
In many technical systems, the system state to be controlled is not directly accessible but has to be estimated from observations. Furthermore, the uncertainties arising from this estimation are typically neglected in the controller. To remedy this deficiency, we present a novel approach to stochastic nonlinear model predictive control (NMPC) for heavily noise-affected systems with hidden (i.e., not directly accessible) states, extending the stochastic NMPC framework presented in [1]. An important property of our method is that, in contrast to classical approaches, time-variant system and measurement equations as well as time-variant step rewards can be considered. By extending the techniques from [1] with virtual future observations and combining them with a novel tree search algorithm, called probabilistic branch-and-bound search (PBAB), this challenging problem can be solved with a feasible computational demand.

Stochastic Optimal Control based on Value-Function Approximation using Sinc Interpolation

by Florian Weissel, Marco F. Huber, Dietrich Brunn, Uwe D. Hanebeck
An efficient approach for solving stochastic optimal control problems is to employ dynamic programming (DP). For continuous-valued nonlinear systems, the corresponding DP recursion generally cannot be solved in closed form. Thus, a typical approach is to discretize the DP value functions in order to be able to carry out the calculation. Especially for multidimensional systems, either a large number of discretization points is necessary or the quality of approximation degrades. This problem can be alleviated by interpolating the discretized value function. In this paper, we present an approach based on optimal low-pass interpolation employing sinc functions (sine cardinal). For the important case of systems with Gaussian mixture noise (including the special case of Gaussian noise), we show how the calculations required for this approach, especially the nontrivial calculation of an expected value of a Gaussian mixture random variable transformed by a sinc function, can be carried out analytically. We illustrate the effectiveness of the proposed interpolation scheme by an example from the field of Stochastic Nonlinear Model Predictive Control (SNMPC).
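The interpolation step described in the abstract above can be sketched as follows. This is a minimal one-dimensional illustration of sinc (Whittaker-Shannon) interpolation of a value function sampled on a uniform grid; it does not reproduce the paper's analytic expectation under Gaussian mixture noise. The function names, the grid parameters `x0` and `step`, and the sampled function are assumptions made for illustration.

```python
import math


def sinc(t: float) -> float:
    """Normalized sinc: sin(pi * t) / (pi * t), with sinc(0) defined as 1."""
    if t == 0.0:
        return 1.0
    return math.sin(math.pi * t) / (math.pi * t)


def sinc_interpolate(x: float, x0: float, step: float, samples: list) -> float:
    """Interpolate samples taken at x0, x0 + step, x0 + 2 * step, ... at point x."""
    total = 0.0
    for k, value in enumerate(samples):
        grid_point = x0 + k * step
        # Each sample contributes a sinc kernel centered on its grid point.
        total += value * sinc((x - grid_point) / step)
    return total


# Hypothetical discretized value function on a uniform grid (illustration only).
x0, step = 0.0, 0.5
samples = [math.exp(-((x0 + k * step) - 2.0) ** 2) for k in range(9)]

# Value estimate at a point that lies between two grid points.
between = sinc_interpolate(1.25, x0, step, samples)
```

Because the normalized sinc is 1 at zero and 0 at every other integer, the interpolant passes through the stored samples at the grid points; between grid points it gives a smooth, band-limited estimate.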

Toward Human-Multiagent Teams

by Nathan Schurr, 2007
Abstract not found

(Short Paper)

by Nathan Schurr, Janusz Marecki, Milind Tambe
robust approach to adjustable autonomy for

Improving Adjustable Autonomy Strategies for Time-Critical Domains

by Nathan Schurr (Aptima Inc), Janusz Marecki, Milind Tambe
As agents begin to perform complex tasks alongside humans as collaborative teammates, it becomes crucial that the resulting human-multiagent teams adapt to time-critical domains. In such domains, adjustable autonomy has proven useful by allowing for a dynamic transfer of control of decision making between humans and agents. However, existing adjustable autonomy algorithms commonly discretize time, which not only results in high algorithm runtimes but also translates into inaccurate transfer-of-control policies. In addition, existing techniques fail to address decision-making inconsistencies often encountered in human-multiagent decision making. To address these limitations, we present a novel approach for Resolving Inconsistencies in Adjustable Autonomy in Continuous Time (RIAACT) that makes three contributions: First, we apply a continuous-time planning paradigm to adjustable autonomy, resulting in high-accuracy transfer-of-control policies. Second, our new adjustable autonomy framework both models and plans for the resolving of inconsistencies between human and agent decisions. Third, we introduce a new model, Interruptible Action Time-dependent Markov Decision Problem (IA-TMDP), which allows for actions to be interrupted at any point in continuous time. We show how to solve IA-TMDPs efficiently and leverage them to plan for the resolving of inconsistencies in RIAACT. Furthermore, these contributions have been realized and evaluated in a complex disaster response simulation system.

Continuous Time Planning for Multiagent Teams with Temporal Constraints

by Zhengyu Yin, Milind Tambe - In Proceedings of the Twenty-Second International Joint Conference on Artificial Intelligence
Continuous state DEC-MDPs are critical for agent teams in domains involving resources such as time, but scaling them up is a significant challenge. To meet this challenge, we first introduce a novel continuous-time DEC-MDP model that exploits transition independence in domains with temporal constraints. More importantly, we present a new locally optimal algorithm called SPAC. Compared to the best previous algorithm, SPAC finds solutions of comparable quality substantially faster; SPAC also scales to larger teams of agents.