Taming Decentralized POMDPs: Towards Efficient Policy Computation for Multiagent Settings
 In IJCAI
, 2003
Cited by 191 (26 self)
The problem of deriving joint policies for a group of agents that maximize some joint reward function can be modeled as a decentralized partially observable Markov decision process (POMDP). Yet, despite
Optimizing Information Exchange in Cooperative Multiagent Systems
, 2003
Cited by 107 (18 self)
Decentralized control of a cooperative multiagent system is the problem faced by multiple decision-makers that share a common set of objectives. The decision-makers may be robots placed at separate geographical locations or computational processes distributed in an information space. It may be impossible or undesirable for these decision-makers to share all their knowledge all the time. Furthermore, exchanging information may incur a cost associated with the required bandwidth or with the risk of revealing it to competing agents. Assuming that communication may not be reliable adds another dimension of complexity to the problem. This paper develops a decision-theoretic solution to this problem, treating both standard actions and communication as explicit choices that the decision-maker must consider. The goal is to derive both action policies and communication policies that together optimize a global value function. We present an analytical model to evaluate the tradeoff between the cost of communication and the value of the information received. Finally, to address the complexity of this hard optimization problem, we develop a practical approximation technique based on myopic meta-level control of communication.
Solving transition independent decentralized Markov decision processes
 JAIR
, 2004
Cited by 107 (13 self)
Formal treatment of collaborative multiagent systems has been lagging behind the rapid progress in sequential decision making by individual agents. Recent work in the area of decentralized Markov Decision Processes (MDPs) has contributed to closing this gap, but the computational complexity of these models remains a serious obstacle. To overcome this complexity barrier, we identify a specific class of decentralized MDPs in which the agents' transitions are independent. The class consists of independent collaborating agents that are tied together through a structured global reward function that depends on all of their histories of states and actions. We present a novel algorithm for solving this class of problems and examine its properties, both as an optimal algorithm and as an anytime algorithm. To the best of our knowledge, this is the first algorithm to optimally solve a nontrivial subclass of decentralized MDPs. It lays the foundation for further work in this area on both exact and approximate algorithms.
A Survey of Multi-Agent Organizational Paradigms
 The Knowledge Engineering Review
, 2005
Cited by 103 (2 self)
Many researchers have demonstrated that the organizational design employed by a system can have a significant, quantitative effect on its performance characteristics. A range of organizational strategies have emerged from this line of research, each with different strengths and weaknesses. In this article we present a survey of the major organizational paradigms used in multiagent systems. These include hierarchies, holarchies, coalitions, teams, congregations, societies, federations, and matrix organizations. We will provide a description of each, discuss their costs and benefits, and provide examples of how they may be instantiated and maintained.
Improved memory-bounded dynamic programming for decentralized POMDPs
 In Proceedings of the Twenty-Third Conference on Uncertainty in Artificial Intelligence
, 2007
Cited by 94 (22 self)
Decentralized decision making under uncertainty has been shown to be intractable when each agent has different partial information about the domain. Thus, improving the applicability and scalability of planning algorithms is an important challenge. We present the first memory-bounded dynamic programming algorithm for finite-horizon decentralized POMDPs. A set of heuristics is used to identify relevant points of the infinitely large belief space. Using these belief points, the algorithm successively selects the best joint policies for each horizon. The algorithm is extremely efficient, having linear time and space complexity with respect to the horizon length. Experimental results show that it can handle horizons that are multiple orders of magnitude larger than what was previously possible, while achieving the same or better solution quality. These results significantly increase the applicability of decentralized decision-making techniques.
Approximate Solutions for Partially Observable Stochastic Games with Common Payoffs
 In Proc. of Int. Joint Conference on Autonomous Agents and Multi Agent Systems
, 2004
Cited by 92 (2 self)
Partially observable decentralized decision making in robot teams is fundamentally different from decision making in fully observable problems. Team members cannot simply apply single-agent solution techniques in parallel. Instead, we must turn to game-theoretic frameworks to correctly model the problem. While partially observable stochastic games (POSGs) provide a solution model for decentralized robot teams, this model quickly becomes intractable. We propose an algorithm that approximates POSGs as a series of smaller, related Bayesian games, using heuristics such as QMDP to provide the future discounted value of actions. This algorithm trades off limited lookahead in uncertainty for computational feasibility, and results in policies that are locally optimal with respect to the selected heuristic. Empirical results are provided for both a simple problem for which the full POSG can also be constructed, as well as more complex, robot-inspired problems.
MAA*: A heuristic search algorithm for solving decentralized POMDPs
 In Proceedings of the Twenty-First Conference on Uncertainty in Artificial Intelligence
, 2005
Cited by 91 (21 self)
We present multi-agent A* (MAA*), the first complete and optimal heuristic search algorithm for solving decentralized partially-observable Markov decision problems (DEC-POMDPs) with finite horizon. The algorithm is suitable for computing optimal plans for a cooperative group of agents that operate in a stochastic environment such as multi-robot coordination, network traffic control, or distributed resource allocation. Solving such problems effectively is a major challenge in the area of planning under uncertainty. Our solution is based on a synthesis of classical heuristic search and decentralized control theory. Experimental results show that MAA* has significant advantages. We introduce an anytime variant of MAA* and conclude with a discussion of promising extensions such as an approach to solving infinite horizon problems.
Decentralized control of cooperative systems: Categorization and complexity analysis
 Journal of Artificial Intelligence Research
, 2004
Cited by 89 (9 self)
Decentralized control of cooperative systems captures the operation of a group of decision-makers that share a single global objective. The difficulty in solving optimally such problems arises when the agents lack full observability of the global state of the system when they operate. The general problem has been shown to be NEXP-complete. In this paper, we identify classes of decentralized control problems whose complexity ranges between NEXP and P. In particular, we study problems characterized by independent transitions, independent observations, and goal-oriented objective functions. Two algorithms are shown to solve optimally useful classes of goal-oriented decentralized processes in polynomial time. This paper also studies information sharing among the decision-makers, which can improve their performance. We distinguish between three ways in which agents can exchange information: indirect communication, direct communication and sharing state features that are not controlled by the agents. Our analysis shows that for every class of problems we consider, introducing direct or indirect communication does not change the worst-case complexity. The results provide a better understanding of the complexity of decentralized control problems that arise in practice and facilitate the development of planning algorithms for these problems.
Transition-Independent Decentralized Markov Decision Processes
, 2003
Cited by 78 (15 self)
There has been substantial progress with formal models for sequential decision making by individual agents using the Markov decision process (MDP). However, similar treatment of multiagent systems is lacking. A recent complexity result, showing that solving decentralized MDPs is NEXP-hard, provides a partial explanation. To overcome this complexity barrier, we identify a general class of transition-independent decentralized MDPs that is widely applicable. The class consists of independent collaborating agents that are tied together through a global reward function that depends upon both of their histories. We present a novel algorithm for solving this class of problems and examine its properties. The result is the first effective technique to solve optimally a class of decentralized MDPs. This lays the foundation for further work in this area on both exact and approximate solutions.