Results 1–10 of 30
Incremental Clustering and Expansion for Faster Optimal Planning in Decentralized POMDPs
, 2013
"... This article presents the stateoftheart in optimal solution methods for decentralized partially observable Markov decision processes (DecPOMDPs), which are general models for collaborative multiagent planning under uncertainty. Building off the generalized multiagent A * (GMAA*) algorithm, which ..."
Abstract

Cited by 18 (12 self)
This article presents the state of the art in optimal solution methods for decentralized partially observable Markov decision processes (Dec-POMDPs), which are general models for collaborative multiagent planning under uncertainty. Building off the generalized multiagent A* (GMAA*) algorithm, which reduces the problem to a tree of one-shot collaborative Bayesian games (CBGs), we describe several advances that greatly expand the range of Dec-POMDPs that can be solved optimally. First, we introduce lossless incremental clustering of the CBGs solved by GMAA*, which achieves exponential speedups without sacrificing optimality. Second, we introduce incremental expansion of nodes in the GMAA* search tree, which avoids the need to expand all children, the number of which is in the worst case doubly exponential in the node's depth. This is particularly beneficial when little clustering is possible. In addition, we introduce new hybrid heuristic representations that are more compact and thereby enable the solution of larger Dec-POMDPs. We provide theoretical guarantees that, when a suitable heuristic is used, both incremental clustering and incremental expansion yield algorithms that are both complete and search equivalent. Finally, we present extensive empirical results demonstrating that GMAA*-ICE, an algorithm that synthesizes these advances, can optimally solve Dec-POMDPs of unprecedented size.
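The incremental-expansion idea in this abstract, generating a node's children lazily in best-first search instead of all at once, can be sketched generically. The sketch below is a toy illustration under assumed helper callbacks (`children`, `heuristic`, `is_goal`), not the GMAA*-ICE implementation itself:

```python
import heapq

def lazy_best_first(root, children, heuristic, is_goal):
    # Best-first search with incremental node expansion: rather than pushing
    # all children of an expanded node at once, only the next-best child is
    # generated, and the parent is re-inserted with an updated bound so its
    # remaining children can be produced later if needed.
    # `children(node)` must yield children in non-increasing heuristic order.
    open_list = [(-heuristic(root), 0, root, None)]
    tie = 1  # tie-breaker so heap entries compare without comparing nodes
    while open_list:
        _, _, node, gen = heapq.heappop(open_list)
        if is_goal(node):
            return node
        gen = gen if gen is not None else children(node)
        child = next(gen, None)
        if child is None:
            continue  # all children already generated; drop the parent
        # The just-generated child's value bounds all remaining siblings,
        # so it is a valid new priority for the re-inserted parent.
        heapq.heappush(open_list, (-heuristic(child), tie, node, gen))
        heapq.heappush(open_list, (-heuristic(child), tie + 1, child, None))
        tie += 2
    return None
```

With an admissible heuristic, deferring siblings this way never changes which node is expanded first, which is the intuition behind the search-equivalence guarantee claimed above.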
Scalable Multiagent Planning Using Probabilistic Inference
, 2011
"... Multiagent planning has seen much progress with the development of formal models such as DecPOMDPs. However, the complexity of these models—NEXPComplete even for two agents— has limited scalability. We identify certain mild conditions that are sufficient to make multiagent planning amenable to a s ..."
Abstract

Cited by 17 (5 self)
Multiagent planning has seen much progress with the development of formal models such as Dec-POMDPs. However, the complexity of these models (NEXP-complete even for two agents) has limited scalability. We identify certain mild conditions that are sufficient to make multiagent planning amenable to a scalable approximation w.r.t. the number of agents. This is achieved by constructing a graphical model in which likelihood maximization is equivalent to plan optimization. Using the Expectation-Maximization framework for likelihood maximization, we show that the necessary inference can be decomposed into processes that often involve a small subset of agents, thereby facilitating scalability. We derive a global update rule that combines these local inferences to monotonically increase the overall solution quality. Experiments on a large multiagent planning benchmark confirm the benefits of the new approach in terms of runtime and scalability.
Anytime planning for decentralized POMDPs using expectation maximization
 IN UAI
, 2010
"... Decentralized POMDPs provide an expressive framework for multiagent sequential decision making. While finitehorizon DECPOMDPs have enjoyed significant success, progress remains slow for the infinitehorizon case mainly due to the inherent complexity of optimizing stochastic controllers representi ..."
Abstract

Cited by 15 (7 self)
Decentralized POMDPs provide an expressive framework for multiagent sequential decision making. While finite-horizon Dec-POMDPs have enjoyed significant success, progress remains slow for the infinite-horizon case, mainly due to the inherent complexity of optimizing stochastic controllers representing agent policies. We present a promising new class of algorithms for the infinite-horizon case, which recasts the optimization problem as inference in a mixture of DBNs. An attractive feature of this approach is the straightforward adoption of existing inference techniques in DBNs for solving Dec-POMDPs and supporting richer representations such as factored or continuous states and actions. We also derive the Expectation-Maximization (EM) algorithm to optimize the joint policy represented as DBNs. Experiments on benchmark domains show that EM compares favorably against state-of-the-art solvers.
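The planning-as-inference recipe this abstract builds on can be caricatured in a single decision step: treat rewards (scaled to [0, 1]) as the likelihood of a binary "success" observation and run EM on a one-shot bandit. The function below is an illustrative toy under that assumption, not the paper's DBN-mixture algorithm:

```python
def em_bandit_policy(rewards, iters=50):
    # rewards: dict mapping action -> reward scaled into [0, 1], interpreted
    # as the likelihood of observing "success" after taking that action.
    # EM re-weights a stochastic policy toward high-reward actions; each
    # iteration monotonically increases the expected reward of the policy.
    theta = {a: 1.0 / len(rewards) for a in rewards}  # uniform initial policy
    for _ in range(iters):
        # E-step: posterior over actions given success = prior * likelihood
        post = {a: theta[a] * rewards[a] for a in rewards}
        z = sum(post.values())
        # M-step: the new policy is the normalized posterior
        theta = {a: post[a] / z for a in rewards}
    return theta
```

The monotonic-improvement property of this loop is the same EM guarantee that the full sequential algorithms inherit when the "observation" is defined over whole trajectories.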
Decentralized Control of Partially Observable Markov Decision Processes
"... Abstract — Markov decision processes (MDPs) are often used to model sequential decision problems involving uncertainty under the assumption of centralized control. However, many large, distributed systems do not permit centralized control due to communication limitations (such as cost, latency or co ..."
Abstract

Cited by 14 (8 self)
Markov decision processes (MDPs) are often used to model sequential decision problems involving uncertainty under the assumption of centralized control. However, many large, distributed systems do not permit centralized control due to communication limitations (such as cost, latency, or corruption). This paper surveys recent work on decentralized control of MDPs in which control of each agent depends on a partial view of the world. We focus on a general framework where there may be uncertainty about the state of the environment, represented as a decentralized partially observable MDP (Dec-POMDP), but consider a number of subclasses with different assumptions about uncertainty and agent independence. In these models, a shared objective function is used, but plans of action must be based on a partial view of the environment. We describe the frameworks, along with the complexity of optimal control and important properties. We also provide an overview of exact and approximate solution methods as well as relevant applications. This survey provides an introduction to what has become an active area of research on these models and their solutions.
Finite-State Controllers Based on Mealy Machines for Centralized and Decentralized POMDPs
 PROCEEDINGS OF THE TWENTY-FOURTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE (AAAI-10)
, 2010
"... Existing controllerbased approaches for centralized and decentralized POMDPs are based on automata with output known as Moore machines. In this paper, we show that several advantages can be gained by utilizing another type of automata, the Mealy machine. Mealy machines are more powerful than Moore ..."
Abstract

Cited by 13 (6 self)
Existing controller-based approaches for centralized and decentralized POMDPs are based on automata with output known as Moore machines. In this paper, we show that several advantages can be gained by utilizing another type of automaton, the Mealy machine. Mealy machines are more powerful than Moore machines, provide a richer structure that can be exploited by solution methods, and can be easily incorporated into current controller-based approaches. To demonstrate this, we adapted some existing controller-based algorithms to use Mealy machines and obtained results on a set of benchmark domains. The Mealy-based approach always outperformed the Moore-based approach and often outperformed the state-of-the-art algorithms for both centralized and decentralized POMDPs. These findings provide fresh and general insights for the improvement of existing algorithms and the development of new ones.
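The Moore/Mealy distinction the abstract exploits is easy to state in code: a Moore controller's action depends only on its current node, while a Mealy controller's action depends on the current node and the latest observation. A minimal sketch with illustrative class and table names (a real solver would optimize these tables, not fix them by hand):

```python
class MooreController:
    """Output (action) depends only on the current node."""
    def __init__(self, actions, transitions, start=0):
        self.actions = actions          # node -> action
        self.transitions = transitions  # (node, observation) -> node
        self.node = start

    def step(self, observation):
        self.node = self.transitions[(self.node, observation)]
        return self.actions[self.node]

class MealyController:
    """Output (action) depends on the current node AND the observation,
    giving a richer structure for the same number of nodes."""
    def __init__(self, actions, transitions, start=0):
        self.actions = actions          # (node, observation) -> action
        self.transitions = transitions  # (node, observation) -> node
        self.node = start

    def step(self, observation):
        action = self.actions[(self.node, observation)]
        self.node = self.transitions[(self.node, observation)]
        return action
```

With n nodes and |O| observations, the Mealy controller chooses among n·|O| actions where the Moore controller chooses among n, which is the extra expressive power the abstract refers to.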
Point-based policy generation for decentralized POMDPs
 In AAMAS
, 2010
"... Memorybounded techniques have shown great promise in solving complex multiagent planning problems modeled as DECPOMDPs. Much of the performance gains can be attributed to pruning techniques that alleviate the complexity of the exhaustive backup step of the original MBDP algorithm. Despite these i ..."
Abstract

Cited by 13 (0 self)
Memory-bounded techniques have shown great promise in solving complex multiagent planning problems modeled as Dec-POMDPs. Much of the performance gains can be attributed to pruning techniques that alleviate the complexity of the exhaustive backup step of the original MBDP algorithm. Despite these improvements, state-of-the-art algorithms can still handle only a relatively small pool of candidate policies, which limits the quality of the solution in some benchmark problems. We present a new algorithm, Point-Based Policy Generation, which avoids searching the entire joint policy space altogether. The key observation is that the best joint policy for each reachable belief state can be constructed directly, instead of first producing a large set of candidates. We also provide an efficient approximate implementation of this operation. The experimental results show that our solution technique improves performance significantly in terms of both runtime and solution quality.
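The key observation above, constructing the best policy for a reachable belief point directly rather than enumerating candidates first, can be sketched for a single backup. All callbacks (`R`, `P`, `V`) are assumed model and evaluation oracles, and the code is a simplified single-controller view, not the paper's Dec-POMDP algorithm:

```python
def backup_at_belief(belief, actions, observations, subpolicies, R, P, V):
    # One point-based backup at a single belief point.
    #   belief: dict state -> probability (its keys double as the state set)
    #   R(s, a):          immediate reward
    #   P(s2, o, s, a):   joint transition/observation probability
    #   V(pol, s2):       value of continuing with subpolicy `pol` in s2
    # Returns the best (action, observation -> subpolicy map) and its value.
    best_val, best_plan = float('-inf'), None
    for a in actions:
        # Immediate expected reward under the belief.
        val = sum(belief[s] * R(s, a) for s in belief)
        # For each observation, pick the best continuation independently;
        # this is what lets the backup avoid enumerating full joint policies.
        strategy = {}
        for o in observations:
            o_val, o_pol = max(
                ((sum(belief[s] * P(s2, o, s, a) * V(pol, s2)
                      for s in belief for s2 in belief), pol)
                 for pol in subpolicies),
                key=lambda t: t[0])
            val += o_val
            strategy[o] = o_pol
        if val > best_val:
            best_val, best_plan = val, (a, strategy)
    return best_plan, best_val
```

The cost is linear in |actions| · |observations| · |subpolicies| per belief point, versus the exponential number of complete candidate policies an exhaustive backup would enumerate.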
Producing efficient error-bounded solutions for transition independent decentralized MDPs
 in Proceedings of the Twelfth International Conference on Autonomous Agents and Multiagent Systems
, 2013
"... There has been substantial progress on algorithms for singleagent sequential decision making using partially observable Markov decision processes (POMDPs). A number of efficient algorithms for solving POMDPs share two desirable properties: errorbounds and fast convergence rates. Despite significan ..."
Abstract

Cited by 9 (4 self)
There has been substantial progress on algorithms for single-agent sequential decision making using partially observable Markov decision processes (POMDPs). A number of efficient algorithms for solving POMDPs share two desirable properties: error bounds and fast convergence rates. Despite significant efforts, no algorithms for solving decentralized POMDPs benefit from these properties, leading to either poor solution quality or limited scalability. This paper presents the first approach for solving transition independent decentralized Markov decision processes (Dec-MDPs) that inherits these properties. Two related algorithms illustrate this approach. The first recasts the original problem as a deterministic and completely observable Markov decision process. In this form, the original problem is solved by combining heuristic search with constraint optimization to quickly converge to a near-optimal policy. This algorithm also provides the foundation for the first algorithm for solving infinite-horizon transition independent decentralized MDPs. We demonstrate that both methods outperform state-of-the-art algorithms by multiple orders of magnitude, and that for infinite-horizon decentralized MDPs, the algorithm is able to construct more concise policies by searching cyclic policy graphs.
Solving Decision Problems with Limited Information
"... We present a new algorithm for exactly solving decisionmaking problems represented as an influence diagram. We do not require the usual assumptions of no forgetting and regularity, which allows us to solve problems with limited information. The algorithm, which implements a sophisticated variable e ..."
Abstract

Cited by 7 (3 self)
We present a new algorithm for exactly solving decision-making problems represented as an influence diagram. We do not require the usual assumptions of no forgetting and regularity, which allows us to solve problems with limited information. The algorithm, which implements a sophisticated variable elimination procedure, is empirically shown to outperform a state-of-the-art algorithm on randomly generated problems of up to 150 variables and 10^64 strategies.
The Complexity of Approximately Solving Influence Diagrams
"... Influence diagrams allow for intuitive and yet precise description of complex situations involving decision making under uncertainty. Unfortunately, most of the problems described by influence diagrams are hard to solve. In this paper we discuss the complexity of approximately solving influence diag ..."
Abstract

Cited by 6 (4 self)
Influence diagrams allow for intuitive and yet precise description of complex situations involving decision making under uncertainty. Unfortunately, most of the problems described by influence diagrams are hard to solve. In this paper we discuss the complexity of approximately solving influence diagrams. We do not assume no-forgetting or regularity, which makes the class of problems we address very broad. Remarkably, we show that when both the treewidth and the cardinality of the variables are bounded, the problem admits a fully polynomial-time approximation scheme.
Isomorph-Free Branch and Bound Search for Finite State Controllers
"... The recent proliferation of smartphones and other wearable devices has lead to a surge of new mobile applications. Partially observable Markov decision processes provide a natural framework to design applications that continuously make decisions based on noisy sensor measurements. However, given th ..."
Abstract

Cited by 5 (1 self)
The recent proliferation of smartphones and other wearable devices has led to a surge of new mobile applications. Partially observable Markov decision processes provide a natural framework to design applications that continuously make decisions based on noisy sensor measurements. However, given the limited battery life, there is a need to minimize the amount of online computation. This can be achieved by compiling a policy into a finite state controller, since there is then no need for belief monitoring or online search. In this paper, we propose a new branch and bound technique to search for a good controller. In contrast to many existing algorithms for controllers, our search technique is not subject to local optima. We also show how to reduce the amount of search by avoiding the enumeration of isomorphic controllers and by taking advantage of suitable upper and lower bounds. The approach is demonstrated on several benchmark problems as well as a smartphone application to assist persons with Alzheimer's with wayfinding.
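The isomorphism-avoidance idea, pruning controllers that differ only by a relabeling of nodes, can be sketched by canonicalizing each candidate before it enters the search frontier: renumber nodes in breadth-first order from the start node so that structurally identical controllers produce the same key. This is an illustrative construction for deterministic controllers, not the paper's exact method:

```python
def canonical_form(actions, transitions, observations, start=0):
    # actions:     node -> action
    # transitions: (node, observation) -> node
    # Renumber reachable nodes in BFS order from `start`; isomorphic
    # controllers (same structure, permuted node labels) then yield
    # identical keys, so duplicates can be skipped with a set lookup.
    order, label = [start], {start: 0}
    i = 0
    while i < len(order):
        for o in observations:
            nxt = transitions[(order[i], o)]
            if nxt not in label:
                label[nxt] = len(order)
                order.append(nxt)
        i += 1
    return tuple((actions[n],
                  tuple(label[transitions[(n, o)]] for o in observations))
                 for n in order)
```

During branch and bound, a `seen` set of canonical forms then filters each newly generated controller in O(n·|observations|) time before any (much more expensive) evaluation or bounding step.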