Results 1–10 of 18
Decentralized Control of Partially Observable Markov Decision Processes
Abstract

Cited by 14 (8 self)
Abstract — Markov decision processes (MDPs) are often used to model sequential decision problems involving uncertainty under the assumption of centralized control. However, many large, distributed systems do not permit centralized control due to communication limitations (such as cost, latency or corruption). This paper surveys recent work on decentralized control of MDPs in which control of each agent depends on a partial view of the world. We focus on a general framework where there may be uncertainty about the state of the environment, represented as a decentralized partially observable MDP (Dec-POMDP), but consider a number of subclasses with different assumptions about uncertainty and agent independence. In these models, a shared objective function is used, but plans of action must be based on a partial view of the environment. We describe the frameworks, along with the complexity of optimal control and important properties. We also provide an overview of exact and approximate solution methods as well as relevant applications. This survey provides an introduction to what has become an active area of research on these models and their solutions.
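The Dec-POMDP framework surveyed above is conventionally defined as a tuple of states, per-agent action and observation sets, a joint transition function, a joint observation function, and a shared reward function. A minimal sketch in Python, with a brute-force evaluation of a fixed joint-action sequence; all names and the dictionary-based encoding are illustrative assumptions, not the survey's notation:

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple

@dataclass
class DecPOMDP:
    states: List[str]
    agent_actions: List[List[str]]   # agent_actions[i] = action set of agent i
    agent_obs: List[List[str]]       # agent_obs[i] = observation set of agent i
    transition: Dict[Tuple, Dict[str, float]]     # (s, joint_a) -> dist over s'
    observation: Dict[Tuple, Dict[Tuple, float]]  # (joint_a, s') -> dist over joint obs
    reward: Dict[Tuple, float]       # (s, joint_a) -> shared reward
    horizon: int

def open_loop_value(m: DecPOMDP, joint_actions, belief):
    """Expected return of a fixed joint-action sequence from a state distribution."""
    value = 0.0
    for joint_a in joint_actions:
        # Accumulate expected shared reward, then push the belief forward.
        value += sum(p * m.reward[(s, joint_a)] for s, p in belief.items())
        nxt = {}
        for s, p in belief.items():
            for s2, q in m.transition[(s, joint_a)].items():
                nxt[s2] = nxt.get(s2, 0.0) + p * q
        belief = nxt
    return value
```

Real policies condition each agent's action on its private observation history; the open-loop evaluator above only illustrates how the shared reward and joint dynamics compose.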
Optimally solving Dec-POMDPs as continuous-state MDPs
 in Proceedings of the Twenty-Third International Joint Conference on Artificial Intelligence
, 2013
Abstract

Cited by 11 (4 self)
Optimally solving decentralized partially observable Markov decision processes (Dec-POMDPs) is a hard combinatorial problem. Current algorithms search through the space of full histories for each agent. Because of the doubly exponential growth in the number of policies in this space as the planning horizon increases, these methods quickly become intractable. However, in real-world problems, computing policies over the full history space is often unnecessary. True histories experienced by the agents often lie near a structured, low-dimensional manifold embedded in the history space. We show that by transforming a Dec-POMDP into a continuous-state MDP, we are able to find and exploit these low-dimensional representations. Using this novel transformation, we can then apply powerful techniques for solving POMDPs and continuous-state MDPs. By combining a general search algorithm and dimension reduction based on feature selection, we introduce a novel approach to optimally solve problems with significantly longer planning horizons than previous methods.
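The continuous-state transformation described above works with an occupancy state: a distribution over (hidden state, joint observation history) pairs that the planner can update deterministically once a joint decision rule is fixed. A minimal sketch of that update, using assumed dictionary-based names rather than the paper's notation:

```python
def next_occupancy(occ, decision_rule, transition, observation):
    """One-step occupancy-state update.

    occ:           {(state, joint_obs_history): prob}
    decision_rule: {joint_obs_history: joint_action}
    transition:    {(state, joint_action): {next_state: prob}}
    observation:   {(joint_action, next_state): {joint_obs: prob}}
    """
    nxt = {}
    for (s, hist), p in occ.items():
        joint_a = decision_rule[hist]
        for s2, pt in transition[(s, joint_a)].items():
            for o, po in observation[(joint_a, s2)].items():
                key = (s2, hist + (o,))
                nxt[key] = nxt.get(key, 0.0) + p * pt * po
    return nxt
```

Because this update is a deterministic function of the occupancy state and the decision rule, planning over occupancy states is a (continuous-state) MDP, which is the reformulation the abstract exploits.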
Approximate solutions for factored Dec-POMDPs with many agents
 In AAMAS
, 2013
Abstract

Cited by 11 (6 self)
Dec-POMDPs are a powerful framework for planning in multiagent systems, but are provably intractable to solve. This paper proposes a factored forward-sweep policy computation method that tackles the stages of the problem one by one, exploiting weakly coupled structure at each of these stages. An empirical evaluation shows that the loss in solution quality due to these approximations is small and that the proposed method achieves unprecedented scalability, solving Dec-POMDPs with hundreds of agents.
Sufficient Plan-Time Statistics for Decentralized POMDPs
 IJCAI
, 2013
Abstract

Cited by 9 (1 self)
Optimal decentralized decision making in a team of cooperative agents as formalized by decentralized POMDPs is a notoriously hard problem. A major obstacle is that the agents do not have access to a sufficient statistic during execution, which means that they need to base their actions on their histories of observations. A consequence is that even during offline planning the choice of decision rules for different stages is tightly interwoven: decisions of earlier stages affect how to act optimally at later stages, and the optimal value function for a stage is known to have a dependence on the decisions made up to that point. This paper makes a contribution to the theory of decentralized POMDPs by showing how this dependence on the ‘past joint policy’ can be replaced by a sufficient statistic. These results are extended to the case of k-step delayed communication. The paper investigates the practical implications, as well as the effectiveness of a new pruning technique for MAA* methods, in a number of benchmark problems and discusses future avenues of research opened by these contributions.
Planning with macro-actions in decentralized POMDPs
 In Proceedings of the Thirteenth International Conference on Autonomous Agents and Multiagent Systems
, 2014
Abstract

Cited by 5 (3 self)
Decentralized partially observable Markov decision processes (Dec-POMDPs) are general models for decentralized decision making under uncertainty. However, they typically model a problem at a low level of granularity, where each agent’s actions are primitive operations lasting exactly one time step. We address the case where each agent has macro-actions: temporally extended actions which may require different amounts of time to execute. We model macro-actions as options in a factored Dec-POMDP model, focusing on options which depend only on information available to an individual agent while executing. This enables us to model systems where coordination decisions only occur at the level of deciding which macro-actions to execute, and the macro-actions themselves can then be executed to completion. The core technical difficulty when using options in a Dec-POMDP is that the options chosen by the agents no longer terminate at the same time. We present extensions of two leading Dec-POMDP algorithms for generating a policy with options and discuss the resulting form of optimality. Our results show that these algorithms retain agent coordination while allowing near-optimal solutions to be generated for significantly longer horizons and larger state spaces than previous Dec-POMDP methods.
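The abstract's notion of an option that an agent runs to completion using only local information can be sketched as follows; `Option`, `run_option`, and the environment hook are hypothetical names for illustration, not the paper's API:

```python
class Option:
    """A macro-action for a single agent: a low-level policy plus a local
    termination condition, both functions of the agent's own observation."""
    def __init__(self, policy, terminates):
        self.policy = policy          # local observation -> primitive action
        self.terminates = terminates  # local observation -> bool

def run_option(option, step_env, obs):
    """Execute the option until its local termination condition fires.

    step_env(action) applies one primitive action and returns the agent's
    next local observation. Returns (primitive steps taken, final obs)."""
    steps = 0
    while not option.terminates(obs):
        obs = step_env(option.policy(obs))
        steps += 1
    return steps, obs
```

Because termination depends only on the agent's local observation, different agents' options finish at different times, which is exactly the asynchrony the paper's algorithms must handle when choosing the next macro-action for each agent.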
Decentralized stochastic control
 in Annals of Operations Research
, 2014
Abstract

Cited by 2 (0 self)
Decentralized stochastic control refers to the multi-stage optimization of a dynamical system by multiple controllers that have access to different information. Decentralization of information gives rise to new conceptual challenges that require new solution approaches. In this expository paper, we use the notion of an information state to explain the two commonly used solution approaches to decentralized control: the person-by-person approach and the common-information approach.
Planning for Decentralized Control of Multiple Robots Under Uncertainty
Abstract

Cited by 2 (2 self)
We describe a probabilistic framework for synthesizing control policies for general multi-robot systems, given environment and sensor models and a cost function. Decentralized, partially observable Markov decision processes (Dec-POMDPs) are a general model of decision processes where a team of agents must cooperate to optimize some objective (specified by a shared reward or cost function) in the presence of uncertainty, but where communication limitations mean that the agents cannot share their state, so execution must proceed in a decentralized fashion. While Dec-POMDPs are typically intractable to solve for real-world problems, recent research on the use of macro-actions in Dec-POMDPs has significantly increased the size of problem that can be practically solved as a Dec-POMDP. We describe this general model, and show how, in contrast to most existing methods that are specialized to a particular problem class, it can synthesize control policies that use whatever opportunities for coordination are present in the problem, while balancing uncertainty in outcomes, sensor information, and information about other agents. We use three variations on a warehouse task to show that a single planner of this type can generate cooperative behavior using task allocation, direct communication, and signaling, as appropriate.
Probabilistic Inference Techniques for Scalable Multiagent Decision Making
, 2015
Abstract
Decentralized POMDPs provide an expressive framework for multiagent sequential decision making. However, the complexity of these models—NEXP-complete even for two agents—has limited their scalability. We present a promising new class of approximation algorithms by developing novel connections between multiagent planning and machine learning. We show how the multiagent planning problem can be reformulated as inference in a mixture of dynamic Bayesian networks (DBNs). This planning-as-inference approach paves the way for the application of efficient inference techniques in DBNs to multiagent decision making. To further improve scalability, we identify certain conditions that are sufficient to extend the approach to multiagent systems with dozens of agents. Specifically, we show that the necessary inference within the expectation-maximization framework can be decomposed into processes that often involve a small subset of agents, thereby facilitating scalability. We further show that a number of existing multiagent planning models satisfy these conditions. Experiments on large planning benchmarks confirm the benefits of our approach in terms of runtime and scalability with respect to existing techniques.
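The planning-as-inference idea above can be illustrated in miniature: treat normalized reward as the likelihood of a binary "success" variable, then re-estimate a stochastic policy from reward-weighted action counts, which is the flavor of one expectation-maximization update. This toy single-step, single-agent version is an assumption for illustration only, far simpler than the DBN-mixture machinery in the paper:

```python
def em_policy_update(policy, episodes):
    """One reward-weighted policy re-estimation step.

    policy:   {action: prob} (only its support is used here)
    episodes: list of (action, reward) with rewards scaled to [0, 1],
              so reward acts as the likelihood of a success variable."""
    weights = {a: 1e-9 for a in policy}  # tiny floor avoids zero probabilities
    for action, reward in episodes:
        weights[action] += reward
    total = sum(weights.values())
    return {a: w / total for a, w in weights.items()}
```

Iterating such updates concentrates probability on high-reward actions; the paper's contribution is showing how the analogous inference over DBN mixtures decomposes across small subsets of agents.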