Results 1 
4 of
4
Stochastic dominance in stochastic DCOPs for risksensitive applications
 In Proceedings of AAMAS
, 2012
"... Distributed constraint optimization problems (DCOPs) are wellsuited for modeling multiagent coordination problems where the primary interactions are between local subsets of agents. However, one limitation of DCOPs is the assumption that the constraint rewards are without uncertainty. Researchers ..."
Abstract

Cited by 4 (2 self)
 Add to MetaCart
(Show Context)
Distributed constraint optimization problems (DCOPs) are wellsuited for modeling multiagent coordination problems where the primary interactions are between local subsets of agents. However, one limitation of DCOPs is the assumption that the constraint rewards are without uncertainty. Researchers have thus extended DCOPs to Stochastic DCOPs (SDCOPs), where rewards are sampled from known probability distribution reward functions, and introduced algorithms to find solutions with the largest expected reward. Unfortunately, such a solution might be very risky, that is, very likely to result in a poor reward. Thus, in this paper, we make three contributions: (1) we propose a stricter objective for SDCOPs, namely to find a solution with the most stochastically dominating probability distribution reward function; (2) we introduce an algorithm to find such solutions; and (3) we show that stochastically dominating solutions can indeed be less risky than expected reward maximizing solutions.
Decentralized multiagent reinforcement learning in averagereward dynamic DCOPs (theoretical proofs
, 2014
"... Researchers have introduced the Dynamic Distributed Constraint Optimization Problem (Dynamic DCOP) formulation to model dynamically changing multiagent coordination problems, where a dynamic DCOP is a sequence of (static canonical) DCOPs, each partially different from the DCOP preceding it. Exist ..."
Abstract

Cited by 3 (2 self)
 Add to MetaCart
(Show Context)
Researchers have introduced the Dynamic Distributed Constraint Optimization Problem (Dynamic DCOP) formulation to model dynamically changing multiagent coordination problems, where a dynamic DCOP is a sequence of (static canonical) DCOPs, each partially different from the DCOP preceding it. Existing work typically assumes that the problem in each time step is decoupled from the problems in other time steps, which might not hold in some applications. Therefore, in this paper, we make the following contributions: (i) We introduce a new model, called Markovian Dynamic DCOPs (MDDCOPs), where the DCOP in the next time step is a function of the value assignments in the current time step; (ii) We introduce two distributed reinforcement learning algorithms, the Distributed RVI Qlearning algorithm and the Distributed Rlearning algorithm, that balance exploration and exploitation to solve MDDCOPs in an online manner; and (iii) We empirically evaluate them against an existing multiarm bandit DCOP algorithm on dynamic DCOPs.
RegretBased MultiAgent Coordination with Uncertain Task Rewards
"... Many multiagent coordination problems can be represented as DCOPs. Motivated by task allocation in disaster response, we extend standard DCOP models to consider uncertain task rewards where the outcome of completing a task depends on its current state, which is randomly drawn from unknown distribu ..."
Abstract

Cited by 1 (0 self)
 Add to MetaCart
(Show Context)
Many multiagent coordination problems can be represented as DCOPs. Motivated by task allocation in disaster response, we extend standard DCOP models to consider uncertain task rewards where the outcome of completing a task depends on its current state, which is randomly drawn from unknown distributions. The goal of solving this problem is to find a solution for all agents that minimizes the overall worstcase loss. This is a challenging problem for centralized algorithms because the search space grows exponentially with the number of agents and is nontrivial for existing algorithms for standard DCOPs. To address this, we propose a novel decentralized algorithm that incorporates MaxSum with iterative constraint generation to solve the problem by passing messages among agents. By so doing, our approach scales well and can solve instances of the task allocation problem with hundreds of agents and tasks.
Decentralized MultiAgent Reinforcement Learning in AverageReward Dynamic DCOPs
"... Abstract. Researchers have introduced the Dynamic Distributed Constraint Optimization Problem (Dynamic DCOP) formulation to model dynamically changing multiagent coordination problems, where a dynamic DCOP is a sequence of (static canonical) DCOPs, each partially different from the DCOP preceding ..."
Abstract
 Add to MetaCart
(Show Context)
Abstract. Researchers have introduced the Dynamic Distributed Constraint Optimization Problem (Dynamic DCOP) formulation to model dynamically changing multiagent coordination problems, where a dynamic DCOP is a sequence of (static canonical) DCOPs, each partially different from the DCOP preceding it. Existing work typically assumes that the problem in each time step is decoupled from the problems in other time steps, which might not hold in some applications. Therefore, in this paper, we make the following contributions: (i) We introduce a new model, called Markovian Dynamic DCOPs (MDDCOPs), where the DCOP in the next time step is a function of the value assignments in the current time step; (ii) We introduce two distributed reinforcement learning algorithms, the Distributed RVI Qlearning algorithm and the Distributed Rlearning algorithm, that balance exploration and exploitation to solve MDDCOPs in an online manner; and (iii) We empirically evaluate them against an existing multiarm bandit DCOP algorithm on dynamic DCOPs. 1