Results 1–10 of 18
Decentralized Control of Partially Observable Markov Decision Processes
Abstract

Cited by 14 (8 self)
Abstract — Markov decision processes (MDPs) are often used to model sequential decision problems involving uncertainty under the assumption of centralized control. However, many large, distributed systems do not permit centralized control due to communication limitations (such as cost, latency or corruption). This paper surveys recent work on decentralized control of MDPs in which control of each agent depends on a partial view of the world. We focus on a general framework where there may be uncertainty about the state of the environment, represented as a decentralized partially observable MDP (Dec-POMDP), but consider a number of subclasses with different assumptions about uncertainty and agent independence. In these models, a shared objective function is used, but plans of action must be based on a partial view of the environment. We describe the frameworks, along with the complexity of optimal control and important properties. We also provide an overview of exact and approximate solution methods as well as relevant applications. This survey provides an introduction to what has become an active area of research on these models and their solutions.
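The Dec-POMDP framework surveyed above is conventionally defined as a tuple of states, per-agent action and observation sets, a joint transition function, a joint observation function, and a shared reward function. A minimal sketch in Python, with a brute-force evaluation of a fixed joint-action sequence; all names and the dictionary-based encoding are illustrative assumptions, not the survey's notation:

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple

@dataclass
class DecPOMDP:
    states: List[str]
    agent_actions: List[List[str]]   # agent_actions[i] = action set of agent i
    agent_obs: List[List[str]]       # agent_obs[i] = observation set of agent i
    transition: Dict[Tuple, Dict[str, float]]     # (s, joint_a) -> dist over s'
    observation: Dict[Tuple, Dict[Tuple, float]]  # (joint_a, s') -> dist over joint obs
    reward: Dict[Tuple, float]       # (s, joint_a) -> shared reward
    horizon: int

def open_loop_value(m: DecPOMDP, joint_actions, belief):
    """Expected return of a fixed joint-action sequence from a state distribution."""
    value = 0.0
    for joint_a in joint_actions:
        # Accumulate expected shared reward, then push the belief forward.
        value += sum(p * m.reward[(s, joint_a)] for s, p in belief.items())
        nxt = {}
        for s, p in belief.items():
            for s2, q in m.transition[(s, joint_a)].items():
                nxt[s2] = nxt.get(s2, 0.0) + p * q
        belief = nxt
    return value
```

Real policies condition each agent's action on its private observation history; the open-loop evaluator above only illustrates how the shared reward and joint dynamics compose.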
Optimally solving Dec-POMDPs as continuous-state MDPs
 in Proceedings of the Twenty-Third International Joint Conference on Artificial Intelligence
, 2013
Abstract

Cited by 11 (4 self)
Optimally solving decentralized partially observable Markov decision processes (Dec-POMDPs) is a hard combinatorial problem. Current algorithms search through the space of full histories for each agent. Because of the doubly exponential growth in the number of policies in this space as the planning horizon increases, these methods quickly become intractable. However, in real-world problems, computing policies over the full history space is often unnecessary. True histories experienced by the agents often lie near a structured, low-dimensional manifold embedded in the history space. We show that by transforming a Dec-POMDP into a continuous-state MDP, we are able to find and exploit these low-dimensional representations. Using this novel transformation, we can then apply powerful techniques for solving POMDPs and continuous-state MDPs. By combining a general search algorithm and dimension reduction based on feature selection, we introduce a novel approach to optimally solve problems with significantly longer planning horizons than previous methods.
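The continuous-state transformation described above works with an occupancy state: a distribution over (hidden state, joint observation history) pairs that the planner can update deterministically once a joint decision rule is fixed. A minimal sketch of that update, using assumed dictionary-based names rather than the paper's notation:

```python
def next_occupancy(occ, decision_rule, transition, observation):
    """One-step occupancy-state update.

    occ:           {(state, joint_obs_history): prob}
    decision_rule: {joint_obs_history: joint_action}
    transition:    {(state, joint_action): {next_state: prob}}
    observation:   {(joint_action, next_state): {joint_obs: prob}}
    """
    nxt = {}
    for (s, hist), p in occ.items():
        joint_a = decision_rule[hist]
        for s2, pt in transition[(s, joint_a)].items():
            for o, po in observation[(joint_a, s2)].items():
                key = (s2, hist + (o,))
                nxt[key] = nxt.get(key, 0.0) + p * pt * po
    return nxt
```

Because this update is a deterministic function of the occupancy state and the decision rule, planning over occupancy states is a (continuous-state) MDP, which is the reformulation the abstract exploits.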
Approximate solutions for factored Dec-POMDPs with many agents
 In AAMAS
, 2013
Abstract

Cited by 11 (6 self)
Dec-POMDPs are a powerful framework for planning in multiagent systems, but are provably intractable to solve. This paper proposes a factored forward-sweep policy computation method that tackles the stages of the problem one by one, exploiting weakly coupled structure at each of these stages. An empirical evaluation shows that the loss in solution quality due to these approximations is small and that the proposed method achieves unprecedented scalability, solving Dec-POMDPs with hundreds of agents.
Sufficient Plan-Time Statistics for Decentralized POMDPs
 IJCAI
, 2013
Abstract

Cited by 9 (1 self)
Optimal decentralized decision making in a team of cooperative agents as formalized by decentralized POMDPs is a notoriously hard problem. A major obstacle is that the agents do not have access to a sufficient statistic during execution, which means that they need to base their actions on their histories of observations. A consequence is that even during offline planning the choice of decision rules for different stages is tightly interwoven: decisions of earlier stages affect how to act optimally at later stages, and the optimal value function for a stage is known to have a dependence on the decisions made up to that point. This paper makes a contribution to the theory of decentralized POMDPs by showing how this dependence on the ‘past joint policy’ can be replaced by a sufficient statistic. These results are extended to the case of k-step delayed communication. The paper investigates the practical implications, as well as the effectiveness of a new pruning technique for MAA* methods, in a number of benchmark problems and discusses future avenues of research opened by these contributions.
Planning with macro-actions in decentralized POMDPs
 In Proceedings of the Thirteenth International Conference on Autonomous Agents and Multiagent Systems
, 2014
Abstract

Cited by 5 (3 self)
Decentralized partially observable Markov decision processes (Dec-POMDPs) are general models for decentralized decision making under uncertainty. However, they typically model a problem at a low level of granularity, where each agent’s actions are primitive operations lasting exactly one time step. We address the case where each agent has macro-actions: temporally extended actions which may require different amounts of time to execute. We model macro-actions as options in a factored Dec-POMDP model, focusing on options which depend only on information available to an individual agent while executing. This enables us to model systems where coordination decisions only occur at the level of deciding which macro-actions to execute, and the macro-actions themselves can then be executed to completion. The core technical difficulty when using options in a Dec-POMDP is that the options chosen by the agents no longer terminate at the same time. We present extensions of two leading Dec-POMDP algorithms for generating a policy with options and discuss the resulting form of optimality. Our results show that these algorithms retain agent coordination while allowing near-optimal solutions to be generated for significantly longer horizons and larger state spaces than previous Dec-POMDP methods.
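The abstract's notion of an option that an agent runs to completion using only local information can be sketched as follows; `Option`, `run_option`, and the environment hook are hypothetical names for illustration, not the paper's API:

```python
class Option:
    """A macro-action for a single agent: a low-level policy plus a local
    termination condition, both functions of the agent's own observation."""
    def __init__(self, policy, terminates):
        self.policy = policy          # local observation -> primitive action
        self.terminates = terminates  # local observation -> bool

def run_option(option, step_env, obs):
    """Execute the option until its local termination condition fires.

    step_env(action) applies one primitive action and returns the agent's
    next local observation. Returns (primitive steps taken, final obs)."""
    steps = 0
    while not option.terminates(obs):
        obs = step_env(option.policy(obs))
        steps += 1
    return steps, obs
```

Because termination depends only on the agent's local observation, different agents' options finish at different times, which is exactly the asynchrony the paper's algorithms must handle when choosing the next macro-action for each agent.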
Decentralized stochastic control
 in Annals of Operations Research
, 2014
Abstract

Cited by 2 (0 self)
Decentralized stochastic control refers to the multi-stage optimization of a dynamical system by multiple controllers that have access to different information. Decentralization of information gives rise to new conceptual challenges that require new solution approaches. In this expository paper, we use the notion of an information state to explain the two commonly used solution approaches to decentralized control: the person-by-person approach and the common-information approach.
Planning for Decentralized Control of Multiple Robots Under Uncertainty
Abstract

Cited by 2 (2 self)
We describe a probabilistic framework for synthesizing control policies for general multi-robot systems, given environment and sensor models and a cost function. Decentralized, partially observable Markov decision processes (Dec-POMDPs) are a general model of decision processes where a team of agents must cooperate to optimize some objective (specified by a shared reward or cost function) in the presence of uncertainty, but where communication limitations mean that the agents cannot share their state, so execution must proceed in a decentralized fashion. While Dec-POMDPs are typically intractable to solve for real-world problems, recent research on the use of macro-actions in Dec-POMDPs has significantly increased the size of problem that can be practically solved as a Dec-POMDP. We describe this general model, and show how, in contrast to most existing methods that are specialized to a particular problem class, it can synthesize control policies that use whatever opportunities for coordination are present in the problem, while balancing uncertainty in outcomes, sensor information, and information about other agents. We use three variations on a warehouse task to show that a single planner of this type can generate cooperative behavior using task allocation, direct communication, and signaling, as appropriate.
Probabilistic Inference Techniques for Scalable Multiagent Decision Making
, 2015
Abstract
Decentralized POMDPs provide an expressive framework for multiagent sequential decision making. However, the complexity of these models—NEXP-complete even for two agents—has limited their scalability. We present a promising new class of approximation algorithms by developing novel connections between multiagent planning and machine learning. We show how the multiagent planning problem can be reformulated as inference in a mixture of dynamic Bayesian networks (DBNs). This planning-as-inference approach paves the way for the application of efficient inference techniques in DBNs to multiagent decision making. To further improve scalability, we identify certain conditions that are sufficient to extend the approach to multiagent systems with dozens of agents. Specifically, we show that the necessary inference within the expectation-maximization framework can be decomposed into processes that often involve a small subset of agents, thereby facilitating scalability. We further show that a number of existing multiagent planning models satisfy these conditions. Experiments on large planning benchmarks confirm the benefits of our approach in terms of runtime and scalability with respect to existing techniques.
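The planning-as-inference idea above can be illustrated in miniature: treat normalized reward as the likelihood of a binary "success" variable, then re-estimate a stochastic policy from reward-weighted action counts, which is the flavor of one expectation-maximization update. This toy single-step, single-agent version is an assumption for illustration only, far simpler than the DBN-mixture machinery in the paper:

```python
def em_policy_update(policy, episodes):
    """One reward-weighted policy re-estimation step.

    policy:   {action: prob} (only its support is used here)
    episodes: list of (action, reward) with rewards scaled to [0, 1],
              so reward acts as the likelihood of a success variable."""
    weights = {a: 1e-9 for a in policy}  # tiny floor avoids zero probabilities
    for action, reward in episodes:
        weights[action] += reward
    total = sum(weights.values())
    return {a: w / total for a, w in weights.items()}
```

Iterating such updates concentrates probability on high-reward actions; the paper's contribution is showing how the analogous inference over DBN mixtures decomposes across small subsets of agents.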