Results 1 - 10 of 13
Integrating Organizational Control into Multi-Agent Learning
"... Multi-Agent Reinforcement Learning (MARL) algorithms suffer from slow convergence and even divergence, especially in largescale systems. In this work, we develop an organization-based control framework to speed up the convergence of MARL algorithms in a network of agents. Our framework defines a mul ..."
Abstract
-
Cited by 11 (3 self)
Multi-Agent Reinforcement Learning (MARL) algorithms suffer from slow convergence and even divergence, especially in large-scale systems. In this work, we develop an organization-based control framework to speed up the convergence of MARL algorithms in a network of agents. Our framework defines a multi-level organizational structure for automated supervision and a communication protocol for exchanging information between lower-level agents and higher-level supervising agents. The abstracted states of lower-level agents travel upwards so that higher-level supervising agents generate a broader view of the state of the network. This broader view is used in creating supervisory information, which is passed down the hierarchy. The supervisory policy adaptation then integrates supervisory information into existing MARL algorithms, guiding agents' exploration of their state-action space. The generality of our framework is verified by its application to different domains (distributed task allocation and network routing) with different MARL algorithms. Experimental results show that our framework improves both the speed and likelihood of MARL convergence.
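As a rough illustration of the supervisory policy adaptation idea described above, the sketch below biases an agent's exploration with a supervisor-provided rating. The SupervisedLearner class, the supervisory_rating callback, and the Boltzmann-style weighting are assumptions made for illustration, not the paper's exact formulation.

```python
import math
import random
from collections import defaultdict

class SupervisedLearner:
    """Q-learning agent whose exploration is biased by supervisory information.

    supervisory_rating(state, action) is a hypothetical callback supplied by a
    higher-level supervisor, returning a score in [-1, 1]; positive scores make
    an action more likely to be tried, negative scores less likely."""

    def __init__(self, actions, alpha=0.1, gamma=0.95, temperature=1.0, bias=1.0):
        self.q = defaultdict(float)          # Q-values keyed by (state, action)
        self.actions = list(actions)
        self.alpha, self.gamma = alpha, gamma
        self.temperature, self.bias = temperature, bias

    def act(self, state, supervisory_rating):
        # Boltzmann exploration over Q-values shifted by the supervisor's rating.
        prefs = [self.q[(state, a)] / self.temperature
                 + self.bias * supervisory_rating(state, a)
                 for a in self.actions]
        m = max(prefs)
        weights = [math.exp(p - m) for p in prefs]
        return random.choices(self.actions, weights=weights, k=1)[0]

    def update(self, state, action, reward, next_state):
        # Standard Q-learning update; only exploration is supervised.
        best_next = max(self.q[(next_state, a)] for a in self.actions)
        target = reward + self.gamma * best_next
        self.q[(state, action)] += self.alpha * (target - self.q[(state, action)])
```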
A general framework for interacting Bayes-optimally with self-interested agents using arbitrary parametric model and model prior. arXiv:1304.2024
2013
"... Recent advances in Bayesian reinforcement learn-ing (BRL) have shown that Bayes-optimality is theoretically achievable by modeling the envi-ronment’s latent dynamics using Flat-Dirichlet-Multinomial (FDM) prior. In self-interested multi-agent environments, the transition dynamics are mainly controll ..."
Abstract
-
Cited by 5 (4 self)
Recent advances in Bayesian reinforcement learning (BRL) have shown that Bayes-optimality is theoretically achievable by modeling the environment's latent dynamics using a Flat-Dirichlet-Multinomial (FDM) prior. In self-interested multi-agent environments, the transition dynamics are mainly controlled by the other agent's stochastic behavior, for which FDM's independence and modeling assumptions do not hold. As a result, FDM does not allow the other agent's behavior to be generalized across different states or specified using prior domain knowledge. To overcome these practical limitations of FDM, we propose a generalization of BRL that integrates the general class of parametric models and model priors, thus allowing practitioners' domain knowledge to be exploited to produce a fine-grained and compact representation of the other agent's behavior. Empirical evaluation shows that our approach outperforms existing multi-agent reinforcement learning algorithms.
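To make the contrast with per-state Dirichlet counts concrete, here is a minimal sketch of a parametric opponent model with a model prior: a single behaviour parameter w, a grid-approximated posterior, and a posterior-predictive forecast of the other agent's action. The sigmoid model, the feature function, and the grid approximation are assumptions for illustration, not the paper's construction.

```python
import numpy as np

def feature(state):
    # Hypothetical 1-D state feature; real applications would encode
    # domain knowledge in richer features.
    return float(state)

# Grid approximation of the posterior over a single behaviour parameter w.
w_grid = np.linspace(-3.0, 3.0, 301)
log_post = np.zeros_like(w_grid)        # flat (log-)prior over w

def predict(state):
    """Posterior-predictive probability that the other agent plays action 1."""
    p = 1.0 / (1.0 + np.exp(-w_grid * feature(state)))
    post = np.exp(log_post - log_post.max())
    post /= post.sum()
    return float(np.dot(post, p))

def observe(state, action):
    """Bayesian update of the posterior after seeing the other agent act."""
    global log_post
    p = 1.0 / (1.0 + np.exp(-w_grid * feature(state)))
    log_post = log_post + np.log(p if action == 1 else 1.0 - p)

# Because one parameter w governs behaviour in every state, a single
# observation sharpens predictions across all states, unlike independent
# Dirichlet counts kept separately per state.
```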
Improving Reinforcement Learning by using Case Based Heuristics
"... Abstract. This work presents a new approach that allows the use of cases in a case base as heuristics to speed up Reinforcement Learning algorithms, combining Case Based Reasoning (CBR) and Reinforcement Learning (RL) techniques. This approach, called Case Based Heuristically Accelerated Reinforceme ..."
Abstract
-
Cited by 2 (1 self)
This work presents a new approach that allows the use of cases in a case base as heuristics to speed up Reinforcement Learning algorithms, combining Case Based Reasoning (CBR) and Reinforcement Learning (RL) techniques. This approach, called Case Based Heuristically Accelerated Reinforcement Learning (CB-HARL), builds upon an emerging technique, Heuristically Accelerated Reinforcement Learning (HARL), in which RL methods are accelerated by making use of heuristic information. CB-HARL is a subset of RL that makes use of a heuristic function derived from a case base, in a Case Based Reasoning manner. An algorithm that incorporates CBR techniques into Heuristically Accelerated Q-Learning is also proposed. Empirical evaluations were conducted in a simulator for the RoboCup Four-Legged Soccer Competition, and the results show that with CB-HARL, agents learn faster than with either RL or HARL methods.
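The general HARL action-selection rule that CB-HARL builds on can be sketched as an epsilon-greedy choice over Q-values biased by a heuristic term; in CB-HARL the heuristic would be filled in from the most similar retrieved case. The table names and the weighting factor xi below are illustrative assumptions.

```python
import random
from collections import defaultdict

Q = defaultdict(float)     # learned action values, keyed by (state, action)
H = defaultdict(float)     # heuristic values derived from retrieved cases

def select_action(state, actions, epsilon=0.1, xi=1.0):
    """Epsilon-greedy selection over Q biased by the case-based heuristic H."""
    if random.random() < epsilon:
        return random.choice(actions)          # explore
    # Exploit: the heuristic nudges the choice towards the action suggested
    # by the retrieved case without overwriting what has been learned in Q.
    return max(actions, key=lambda a: Q[(state, a)] + xi * H[(state, a)])
```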
Transparent Modelling of Finite Stochastic Processes for Multiple Agents
2008
"... Abstract. Stochastic Processes are ubiquitous, from automated engineering, through financial markets, to space exploration. These systems are typically highly dynamic, unpredictable and resistant to analytic methods; coupled with a need to orchestrate long control sequences which are both highly com ..."
Abstract
-
Cited by 2 (2 self)
Stochastic processes are ubiquitous, from automated engineering, through financial markets, to space exploration. These systems are typically highly dynamic, unpredictable, and resistant to analytic methods, and they demand the orchestration of long control sequences that are both highly complex and uncertain. This report examines some existing single- and multi-agent modelling frameworks, details their strengths and weaknesses, and uses the experience to identify some fundamental tenets of good practice in modelling stochastic processes. It goes on to develop a new family of frameworks based on these tenets, which can model single- and multi-agent domains with equal clarity and flexibility, while remaining close enough to the existing frameworks that existing analytic and learning tools can be applied with little or no adaptation. Some simple and larger examples illustrate the similarities and differences of this approach, and a discussion of the challenges inherent in developing more flexible tools to exploit these new frameworks concludes matters.
Coordination guided reinforcement learning.
- In AAMAS, 2012
"... ABSTRACT In this paper, we propose to guide reinforcement learning (RL) with expert coordination knowledge for multi-agent problems managed by a central controller. The aim is to learn to use expert coordination knowledge to restrict the joint action space and to direct exploration towards more pro ..."
Abstract
-
Cited by 1 (1 self)
In this paper, we propose to guide reinforcement learning (RL) with expert coordination knowledge for multi-agent problems managed by a central controller. The aim is to learn to use expert coordination knowledge to restrict the joint action space and to direct exploration towards more promising states, thereby improving the overall learning rate. We model such coordination knowledge as constraints and propose a two-level RL system that utilizes these constraints for online applications. Our declarative approach to specifying coordination in multi-agent learning allows knowledge sharing between constraints and features (basis functions) for function approximation. Results on a soccer game and a tactical real-time strategy game show that coordination constraints improve the learning rate compared to using only unary constraints. The two-level RL system also outperforms an existing single-level approach that utilizes joint action selection via coordination graphs.
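One way to picture restricting the joint action space with declarative coordination constraints is sketched below: the central controller enumerates joint actions, filters out those violating any expert constraint, and acts greedily among the survivors. The helper names and the q_value function are assumptions, not the paper's two-level system.

```python
from itertools import product

def feasible_joint_actions(agent_actions, constraints):
    """Keep only joint actions that satisfy every expert coordination
    constraint; each constraint is a predicate over the joint action tuple."""
    return [joint for joint in product(*agent_actions)
            if all(check(joint) for check in constraints)]

def greedy_joint_action(state, agent_actions, constraints, q_value):
    """Central controller: best feasible joint action under a learned
    joint value function q_value(state, joint_action)."""
    candidates = feasible_joint_actions(agent_actions, constraints)
    return max(candidates, key=lambda joint: q_value(state, joint))

# Example constraint: no two agents pick the same target.
no_duplicates = lambda joint: len(set(joint)) == len(joint)
```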
Efficient Multi-Agent Reinforcement Learning through Automated Supervision (Short Paper)
"... Multi-Agent Reinforcement Learning (MARL) algorithms suffer from slow convergence and even divergence, especially in large-scale systems. In this work, we develop a supervision framework to speed up the convergence of MARL algorithms in a network of agents. The framework defines an organizational st ..."
Abstract
-
Cited by 1 (0 self)
Multi-Agent Reinforcement Learning (MARL) algorithms suffer from slow convergence and even divergence, especially in large-scale systems. In this work, we develop a supervision framework to speed up the convergence of MARL algorithms in a network of agents. The framework defines an organizational structure for automated supervision and a communication protocol for exchanging information between lower-level agents and higher-level supervising agents. The abstracted states of lower-level agents travel upwards so that higher-level supervising agents generate a broader view of the state of the network. This broader view is used in creating supervisory information, which is passed down the hierarchy. We present a generic extension to MARL algorithms that integrates supervisory information into the learning process, guiding agents' exploration of their state-action space.
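To complement the exploration-biasing sketch given for the first result above, this fragment illustrates the upward half of such a hierarchy: subordinates report abstracted states and a supervisor fuses them into a broader view from which simple supervisory information is derived. The load-level abstraction and the "avoid" suggestion are illustrative assumptions.

```python
def abstract_state(local_state):
    # Hypothetical abstraction: report only a coarse load level rather than
    # the full local state (queues, neighbours, pending tasks, ...).
    return "high" if local_state["queue_length"] > 10 else "low"

def supervise(abstracted_reports):
    """Supervisor step: combine subordinates' abstracted states into a broader
    view and derive a piece of supervisory information to pass back down."""
    overloaded = [agent for agent, level in abstracted_reports.items()
                  if level == "high"]
    return {"avoid": set(overloaded)}   # e.g. route work away from these agents

reports = {"a1": abstract_state({"queue_length": 14}),
           "a2": abstract_state({"queue_length": 3})}
print(supervise(reports))               # {'avoid': {'a1'}}
```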
Learning to Act Stochastically
"... This thesis examines reinforcement learning for stochastic control processes with single and multiple agents, where either the learning outcomes are stochastic policies or learning is perpetual and within the domain of stochastic policies. In this context, a policy is a strategy for processing envir ..."
Abstract
-
Cited by 1 (1 self)
This thesis examines reinforcement learning for stochastic control processes with single and multiple agents, where either the learning outcomes are stochastic policies or learning is perpetual and within the domain of stochastic policies. In this context, a policy is a strategy for processing environmental outputs (called observations) and subsequently generating a response or input signal to the environment (called actions). A stochastic policy gives a probability distribution over actions for each observed situation, and the thesis concentrates on finite sets of observations and actions. There is an exclusive focus on stochastic policies for two principal reasons: such policies have been relatively neglected in the existing literature, and they have been recognised to be especially important in the field of multi-agent reinforcement learning. For the latter reason, the thesis concerns itself primarily with solutions best suited to multi-agent domains. This restriction proves essential, since the topic is otherwise too broad to be covered in depth without losing some clarity and focus. The thesis is partitioned into three parts, with a chapter of contextual information preceding the first part. Part 1 focuses on analytic and formal mathematical approaches
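For finite observation and action sets, a stochastic policy of the kind the thesis studies can be represented as one probability distribution per observation; the softmax parameterisation below is only one convenient, assumed choice.

```python
import math
import random

class TabularStochasticPolicy:
    """One softmax distribution over a finite action set per observation."""

    def __init__(self, observations, actions):
        self.actions = list(actions)
        self.prefs = {o: [0.0] * len(self.actions) for o in observations}

    def probabilities(self, observation):
        prefs = self.prefs[observation]
        m = max(prefs)
        exps = [math.exp(p - m) for p in prefs]
        z = sum(exps)
        return [e / z for e in exps]

    def sample(self, observation):
        # Draw an action according to the distribution for this observation.
        return random.choices(self.actions,
                              weights=self.probabilities(observation), k=1)[0]
```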
Interactive POMDP Lite: Towards Practical Planning to Predict and Exploit Intentions for Interacting with Self-Interested Agents
- In Proceedings of the Twenty-Third International Joint Conference on Artificial Intelligence
"... A key challenge in non-cooperative multi-agent systems is that of developing efficient planning algorithms for intelligent agents to interact and perform effectively among boundedly rational, selfinterested agents (e.g., humans). The practicality of existing works addressing this challenge is being ..."
Abstract
- Add to MetaCart
A key challenge in non-cooperative multi-agent systems is that of developing efficient planning algorithms for intelligent agents to interact and perform effectively among boundedly rational, self-interested agents (e.g., humans). The practicality of existing works addressing this challenge is undermined by either the restrictive assumptions made about the other agents' behavior, the failure to account for their rationality, or the prohibitively expensive cost of modeling and predicting their intentions. To boost the practicality of research in this field, we investigate how intention prediction can be efficiently exploited and made practical in planning, thereby leading to efficient intention-aware planning frameworks capable of predicting the intentions of other agents and acting optimally with respect to their predicted intentions. We show that the performance losses incurred by the resulting planning policies are linearly bounded by the error of intention prediction. Empirical evaluations through a series of stochastic games demonstrate that our policies can achieve better and more robust performance than state-of-the-art algorithms.
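"Acting optimally with respect to a predicted intention" can be pictured, in the simplest one-shot case, as a best response against the predicted policy of the other agent; the payoff matrix and predicted distribution below are placeholders, not the paper's framework.

```python
import numpy as np

def best_response(payoff, predicted_opponent_policy):
    """Pick the row action maximising expected payoff under the predicted
    column (opponent) policy; payoff[i, j] is our reward for (i, j)."""
    expected = payoff @ predicted_opponent_policy
    return int(np.argmax(expected)), float(expected.max())

payoff = np.array([[3.0, 0.0],
                   [5.0, 1.0]])
prediction = np.array([0.7, 0.3])   # predicted distribution over opponent actions
print(best_response(payoff, prediction))
```

Intuitively, a small error in the predicted distribution changes the expected payoff of any action by at most a proportional amount, which is the flavour of the linear performance bound the abstract claims.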
The Use of Cases as Heuristics to speed up Multiagent Reinforcement Learning
"... This work presents a new approach that allows the use of cases in a case base as heuristics to speed up Multiagent Reinforcement Learning algorithms, combining Case Based Reasoning (CBR) and Multiagent Reinforcement Learning (MRL) techniques. This approach, called Case Based Heuristically Accelera ..."
Abstract
- Add to MetaCart
This work presents a new approach that allows the use of cases in a case base as heuristics to speed up Multiagent Reinforcement Learning algorithms, combining Case Based Reasoning (CBR) and Multiagent Reinforcement Learning (MRL) techniques. This approach, called Case Based Heuristically Accelerated Multiagent Reinforcement Learning (CB-HAMRL), builds upon an emerging technique, Heuristically Accelerated Reinforcement Learning (HARL), in which RL methods are accelerated by making use of heuristic information. CB-HAMRL is a subset of MRL that makes use of a heuristic function H derived from a case base, in a Case Based Reasoning manner. An algorithm that incorporates CBR techniques into Heuristically Accelerated Minimax-Q is also proposed, and empirical evaluations were conducted in a simulator for the robot soccer domain, comparing the three solutions to this problem: MRL, HAMRL, and CB-HAMRL. Experimental results show that with CB-HAMRL, agents learn faster than with RL or HAMRL methods.
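Minimax-Q, which the CB-HAMRL algorithm extends, evaluates a state by solving a small linear program over the learned joint-action values; the sketch below (using scipy's linprog) shows only that evaluation step, with the case-based heuristic bias omitted. The function name and matrix layout are assumptions for illustration.

```python
import numpy as np
from scipy.optimize import linprog

def minimax_value(q_matrix):
    """State value in Minimax-Q: q_matrix[a, o] is the learned value of our
    action a against opponent action o.  Solves
        max over pi of  min over o of  sum_a pi[a] * q_matrix[a, o]
    as a linear program; returns the mixed policy pi and the value."""
    n_a, n_o = q_matrix.shape
    c = np.concatenate([np.zeros(n_a), [-1.0]])                 # minimise -v
    # For every opponent action o:  v - sum_a pi[a] * q_matrix[a, o] <= 0
    A_ub = np.hstack([-q_matrix.T, np.ones((n_o, 1))])
    b_ub = np.zeros(n_o)
    A_eq = np.concatenate([np.ones(n_a), [0.0]]).reshape(1, -1)  # pi sums to 1
    b_eq = np.array([1.0])
    bounds = [(0.0, 1.0)] * n_a + [(None, None)]
    res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=b_eq, bounds=bounds)
    return res.x[:n_a], res.x[n_a]
```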