Results 1–10 of 40
Game theoretic control for robot teams
 In Proc. of the IEEE International Conference on Robotics and Automation, 2005
Cited by 44 (1 self)
Abstract — In the real world, noisy sensors and limited communication make it difficult for robot teams to coordinate in tightly coupled tasks. Team members cannot simply apply single-robot solution techniques for partially observable problems in parallel because they do not take into account the recursive effect that reasoning about the beliefs of others has on policy generation. Instead, we must turn to a game-theoretic approach to model the problem correctly. Partially observable stochastic games (POSGs) provide a solution model for decentralized robot teams; however, this model quickly becomes intractable. In previous work we presented an algorithm for lookahead search in POSGs. Here we present an extension which reduces computation during lookahead by clustering similar observation histories together. We show that by clustering histories which have similar profiles of predicted reward, we can greatly reduce the computation time required to solve a POSG while maintaining a good approximation to the optimal policy. We demonstrate the power of the clustering algorithm in a real-time robot controller as well as for a simple benchmark problem.
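The clustering idea in this abstract can be sketched in a few lines. The sketch below merges observation histories whose vectors of predicted reward (one entry per candidate action) lie within a tolerance of one another, shrinking the set a lookahead search must reason over. All names, the data, and the `eps` threshold are illustrative assumptions, not taken from the paper.

```python
def cluster_histories(histories, reward_profile, eps=0.1):
    """Group observation histories by similar predicted-reward profiles.

    histories      -- list of hashable observation histories
    reward_profile -- maps a history to a tuple of predicted rewards,
                      one per candidate action
    eps            -- max per-action reward difference within a cluster
    """
    clusters = []  # list of (representative profile, member histories)
    for h in histories:
        profile = reward_profile(h)
        for rep, members in clusters:
            if all(abs(a - b) <= eps for a, b in zip(rep, profile)):
                members.append(h)
                break
        else:
            clusters.append((profile, [h]))
    return clusters


# Toy example: four histories, two candidate actions each.
profiles = {
    ("o1",): (1.0, 0.0),
    ("o2",): (1.05, 0.02),   # near-duplicate of ("o1",)
    ("o3",): (0.0, 1.0),
    ("o4",): (0.02, 0.98),   # near-duplicate of ("o3",)
}
clusters = cluster_histories(list(profiles), profiles.get, eps=0.1)
print(len(clusters))  # 2 clusters instead of 4 histories
```

Histories in the same cluster are then treated as one node during lookahead, which is where the reported speedup comes from.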
Decentralized planning under uncertainty for teams of communicating agents
 In Proc. AAMAS, 2006
Cited by 23 (4 self)
Decentralized partially observable Markov decision processes (DEC-POMDPs) form a general framework for planning for groups of cooperating agents that inhabit a stochastic and partially observable environment. Unfortunately, computing optimal plans in a DEC-POMDP has been shown to be intractable (NEXP-complete), and approximate algorithms for specific subclasses have been proposed. Many of these algorithms rely on an (approximate) solution of the centralized planning problem (i.e., treating the whole team as a single agent). We take a more decentralized approach, in which each agent only reasons over its own local state and some uncontrollable state features, which are shared by all team members. In contrast to other approaches, we model communication as an integral part of the agent’s reasoning, in which the meaning of a message is directly encoded in the policy of the communicating agent. We explore iterative methods for approximately solving such models, and we conclude with some encouraging preliminary experimental results.
Modeling and Simulating Human Teamwork Behaviors Using Intelligent Agents
 In Physics of Life Reviews, 2004
Cited by 21 (1 self)
Among researchers in multiagent systems there has been growing interest in using intelligent agents to model and simulate human teamwork behaviors. Teamwork modeling is important for training humans in gaining collaborative skills, for supporting humans in making critical decisions by proactively gathering, fusing, and sharing information, and for building coherent teams with both humans and agents working effectively on intelligence-intensive problems. Teamwork modeling is also challenging because the research has spanned diverse disciplines from business management to cognitive science, human discourse, and distributed artificial intelligence. This article presents an extensive, but not exhaustive, list of work in the field, where the taxonomy is organized along two main dimensions: team social structure and social behaviors. Along the dimension of social structure, we consider agent-only teams and mixed human/agent teams. Along the dimension of social behaviors, we consider collaborative behaviors, communicative behaviors, helping behaviors, and the underpinning of effective teamwork: shared mental models. The contribution of this article is that it presents an organizational framework for analyzing a variety of teamwork simulation systems and for further studying simulated teamwork behaviors.
Incremental Clustering and Expansion for Faster Optimal Planning in Decentralized POMDPs
2013
Cited by 19 (12 self)
This article presents the state-of-the-art in optimal solution methods for decentralized partially observable Markov decision processes (Dec-POMDPs), which are general models for collaborative multiagent planning under uncertainty. Building off the generalized multiagent A* (GMAA*) algorithm, which reduces the problem to a tree of one-shot collaborative Bayesian games (CBGs), we describe several advances that greatly expand the range of Dec-POMDPs that can be solved optimally. First, we introduce lossless incremental clustering of the CBGs solved by GMAA*, which achieves exponential speedups without sacrificing optimality. Second, we introduce incremental expansion of nodes in the GMAA* search tree, which avoids the need to expand all children, the number of which is in the worst case doubly exponential in the node’s depth. This is particularly beneficial when little clustering is possible. In addition, we introduce new hybrid heuristic representations that are more compact and thereby enable the solution of larger Dec-POMDPs. We provide theoretical guarantees that, when a suitable heuristic is used, both incremental clustering and incremental expansion yield algorithms that are both complete and search equivalent. Finally, we present extensive empirical results demonstrating that GMAA*-ICE, an algorithm that synthesizes these advances, can optimally solve Dec-POMDPs of unprecedented size.
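The lossless-clustering criterion described here can be illustrated compactly: two of an agent's observation histories can be merged without loss of optimality when they induce identical conditional distributions over the hidden state and the other agents' histories. The dictionary-based grouping below is a hedged sketch of that idea under toy data; the function names and the distribution encoding are illustrative, not the paper's implementation.

```python
from collections import defaultdict

def lossless_clusters(histories, conditional):
    """Group histories that share the same conditional distribution.

    conditional -- maps a history to a canonical, hashable encoding of
                   P(state, other agents' histories | this history)
    """
    groups = defaultdict(list)
    for h in histories:
        groups[conditional(h)].append(h)
    return list(groups.values())


# Toy example: two histories carry the same information, one differs.
dist = {
    ("a", "o1"): (("s0", 0.5), ("s1", 0.5)),
    ("a", "o2"): (("s0", 0.5), ("s1", 0.5)),  # same distribution -> merged
    ("a", "o3"): (("s0", 1.0), ("s1", 0.0)),
}
cl = lossless_clusters(list(dist), dist.get)
print(sorted(len(c) for c in cl))  # [1, 2]
```

Because merged histories are exactly interchangeable for planning purposes, the resulting search tree is smaller while the computed policy remains optimal.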
A Bilinear Programming Approach for Multiagent Planning
Cited by 17 (2 self)
Multiagent planning and coordination problems are common and known to be computationally hard. We show that a wide range of two-agent problems can be formulated as bilinear programs. We present a successive approximation algorithm that significantly outperforms the coverage set algorithm, which is the state-of-the-art method for this class of multiagent problems. Because the algorithm is formulated for bilinear programs, it is more general and simpler to implement. The new algorithm can be terminated at any time and, unlike the coverage set algorithm, it facilitates the derivation of a useful online performance bound. It is also much more efficient, on average reducing the computation time of the optimal solution by about four orders of magnitude. Finally, we introduce an automatic dimensionality reduction method that improves the effectiveness of the algorithm, extending its applicability to new domains and providing a new way to analyze a subclass of bilinear programs.
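The bilinear structure the abstract refers to is easy to see in miniature: with randomized strategies x and y over finite action sets, the objective x^T A y is linear in each argument when the other is fixed, so fixing one agent's strategy reduces the problem to a trivial linear program for the other. The sketch below alternates best responses on a toy coordination payoff; it converges only to a local optimum, whereas the paper's successive approximation algorithm adds machinery for global solutions that this illustration omits. All names and data are assumptions.

```python
def alternate(A, iters=50):
    """Alternating best responses for the bilinear objective x^T A y."""
    m, n = len(A), len(A[0])
    x = [1.0 / m] * m          # start from uniform strategies
    y = [1.0 / n] * n
    for _ in range(iters):
        # Best pure response for agent 1 given y (an LP over a simplex
        # attains its optimum at a vertex, i.e. a pure strategy).
        scores = [sum(A[i][j] * y[j] for j in range(n)) for i in range(m)]
        i_star = max(range(m), key=scores.__getitem__)
        x = [1.0 if i == i_star else 0.0 for i in range(m)]
        # Best pure response for agent 2 given x.
        scores = [sum(A[i][j] * x[i] for i in range(m)) for j in range(n)]
        j_star = max(range(n), key=scores.__getitem__)
        y = [1.0 if j == j_star else 0.0 for j in range(n)]
    value = sum(A[i][j] * x[i] * y[j] for i in range(m) for j in range(n))
    return value, x, y


# Coordination payoff: both agents should pick the same action.
A = [[1.0, 0.0], [0.0, 2.0]]
value, x, y = alternate(A)
print(value)  # 2.0
```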
Decentralized Communication Strategies for Coordinated Multi-Agent Policies
 In Multi-Robot Systems: From Swarms to Intelligent Automata, volume IV, 2005
Cited by 16 (1 self)
Although the presence of free communication reduces the complexity of multiagent POMDPs to that of single-agent POMDPs, in practice, communication is not free and reducing the frequency of communication is often desirable. We present a novel approach for using centralized "single-agent" policies in decentralized multiagent systems by maintaining and reasoning over the collection of possible joint beliefs of the team. We describe how communication can be used to integrate local observations into the team belief as needed to improve performance, and show both experimentally and through a detailed example how our approach minimizes communication while improving the performance of distributed execution.
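One way to picture the joint-belief reasoning above: each agent tracks the set of joint beliefs the team could currently hold. If the centralized policy prescribes the same joint action under every candidate belief, the agents can act without talking; only on disagreement does an agent share its observation to collapse the set. The decision rule, names, and toy policy below are illustrative assumptions, not the paper's algorithm.

```python
def act_or_communicate(possible_beliefs, policy):
    """Decide whether the team can act silently.

    possible_beliefs -- candidate joint beliefs, true belief first
    policy           -- maps a belief to a joint action
    Returns (action, communicated?).
    """
    actions = {policy(b) for b in possible_beliefs}
    if len(actions) == 1:
        return actions.pop(), False  # consensus: act without talking
    # Disagreement: communicate, then act on the true belief.
    return policy(possible_beliefs[0]), True


# Toy two-state belief: act "open" if P(s0) > 0.5, else "listen".
policy = lambda b: "open" if b[0] > 0.5 else "listen"
a1, c1 = act_or_communicate([(0.8, 0.2), (0.7, 0.3)], policy)
a2, c2 = act_or_communicate([(0.8, 0.2), (0.3, 0.7)], policy)
print(c1, c2)  # False True
```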
Multi-Agent Online Planning with Communication
Cited by 16 (1 self)
We propose an online algorithm for planning under uncertainty in multiagent settings modeled as DEC-POMDPs. The algorithm helps overcome the high computational complexity of solving such problems offline. The key challenge is to produce coordinated behavior using little or no communication. When communication is allowed but constrained, the challenge is to produce high value with minimal communication. The algorithm addresses these challenges by communicating only when history inconsistency is detected, allowing communication to be postponed if necessary. Moreover, it bounds the memory usage at each step and can be applied to problems with arbitrary horizons. The experimental results confirm that the algorithm can solve problems that are too large for the best existing offline planning algorithms and it outperforms the best online method, producing higher value with much less communication in most cases.
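The "communicate only on history inconsistency" trigger can be sketched as a filtering step: an agent keeps the set of joint observations it considers possible, and if its actual local observation could not have arisen under any of them, histories have diverged and the agent communicates to resynchronize. The predicate, the simplified reset, and all names below are illustrative assumptions, not the paper's method.

```python
def step(possible_joint_obs, local_obs, agent):
    """Filter the possible joint observations by the agent's local one.

    possible_joint_obs -- set of joint-observation tuples deemed possible
    local_obs          -- the observation this agent actually received
    agent              -- this agent's index into each tuple
    Returns (surviving set, communicated?).
    """
    surviving = {jo for jo in possible_joint_obs if jo[agent] == local_obs}
    if not surviving:
        # Inconsistency detected: broadcast local_obs and reset to the
        # prior candidate set (a deliberately simplified reconciliation).
        return set(possible_joint_obs), True
    return surviving, False


possible = {("hear-left", "hear-left"), ("hear-right", "hear-right")}
after, talked = step(possible, "hear-left", agent=0)
after2, talked2 = step(after, "hear-right", agent=1)
print(talked, talked2)  # False True: only the second step communicates
```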
Offline planning for communication by exploiting structured interactions in decentralized MDPs
 In Proc. of the IEEE/WIC/ACM International Joint Conferences on Web Intelligence and Intelligent Agent Technologies (WI-IAT'09), 2009
Cited by 14 (0 self)
Variants of the decentralized MDP model focus on problems exhibiting some special structure that makes them easier to solve in practice. Our work is concerned with two main issues. First, we propose a new model, Event-Driven Interaction with Complex Rewards, that addresses problems having structured transition and reward dependence. Our model captures a wider range of problems than existing structured models. In spite of its generality, the model still offers structure that can be leveraged by heuristics and solution algorithms. This is facilitated by explicitly representing interactions as first-class entities. We formulate and solve instances of our model as bilinear programs. Second, we look at making offline planning for communication tractable. To this end, we propose heuristics that limit problem size by making communication available only at a few strategically chosen points based on an analysis that exploits problem structure in the proposed model. Experimental results demonstrate a reduction in problem size and solution time using restricted communication, with little or no decrease in solution quality. Our heuristics therefore allow us to solve problems that would otherwise be intractable.
Decentralized Control of Partially Observable Markov Decision Processes
Cited by 13 (8 self)
Abstract — Markov decision processes (MDPs) are often used to model sequential decision problems involving uncertainty under the assumption of centralized control. However, many large, distributed systems do not permit centralized control due to communication limitations (such as cost, latency or corruption). This paper surveys recent work on decentralized control of MDPs in which control of each agent depends on a partial view of the world. We focus on a general framework where there may be uncertainty about the state of the environment, represented as a decentralized partially observable MDP (Dec-POMDP), but consider a number of subclasses with different assumptions about uncertainty and agent independence. In these models, a shared objective function is used, but plans of action must be based on a partial view of the environment. We describe the frameworks, along with the complexity of optimal control and important properties. We also provide an overview of exact and approximate solution methods as well as relevant applications. This survey provides an introduction to what has become an active area of research on these models and their solutions.
Point-based policy generation for decentralized POMDPs
 In AAMAS, 2010
Cited by 12 (0 self)
Memory-bounded techniques have shown great promise in solving complex multiagent planning problems modeled as DEC-POMDPs. Much of the performance gains can be attributed to pruning techniques that alleviate the complexity of the exhaustive backup step of the original MBDP algorithm. Despite these improvements, state-of-the-art algorithms can still handle only a relatively small pool of candidate policies, which limits the quality of the solution in some benchmark problems. We present a new algorithm, Point-Based Policy Generation, which avoids altogether searching the entire joint policy space. The key observation is that the best joint policy for each reachable belief state can be constructed directly, instead of producing first a large set of candidates. We also provide an efficient approximate implementation of this operation. The experimental results show that our solution technique improves the performance significantly in terms of both runtime and solution quality.
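The direct-construction step this abstract describes can be shown in miniature: for each reachable belief point, evaluate candidate joint actions (or subpolicies) against that belief and keep only the best one, rather than enumerating the joint policy space first. The function names, the value table, and the toy problem below are illustrative assumptions, not the paper's API.

```python
def best_joint_policy_per_belief(belief_points, joint_actions, q_value):
    """For each belief point, keep only the highest-value joint action.

    q_value(belief, joint_action) -> expected value of taking that
    joint action from that belief.
    """
    return {
        b: max(joint_actions, key=lambda a: q_value(b, a))
        for b in belief_points
    }


# Toy problem: two belief points over two states, two joint actions.
Q = {
    ((1.0, 0.0), ("a1", "a1")): 5.0,
    ((1.0, 0.0), ("a2", "a2")): 1.0,
    ((0.0, 1.0), ("a1", "a1")): 0.0,
    ((0.0, 1.0), ("a2", "a2")): 4.0,
}
beliefs = [(1.0, 0.0), (0.0, 1.0)]
acts = [("a1", "a1"), ("a2", "a2")]
plan = best_joint_policy_per_belief(beliefs, acts, lambda b, a: Q[(b, a)])
print(plan[(1.0, 0.0)], plan[(0.0, 1.0)])
```

The pool of retained policies thus grows with the number of reachable belief points rather than with the size of the joint policy space.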