Results 1-7 of 7
Transparent Modelling of Finite Stochastic Processes for Multiple Agents, 2008
Cited by 2 (2 self)
Abstract. Stochastic processes are ubiquitous, from automated engineering, through financial markets, to space exploration. These systems are typically highly dynamic, unpredictable and resistant to analytic methods, and they demand the orchestration of long control sequences which are both highly complex and uncertain. This report examines some existing single and multiagent modelling frameworks, details their strengths and weaknesses, and uses the experience to identify some fundamental tenets of good practice in modelling stochastic processes. It goes on to develop a new family of frameworks based on these tenets, which can model single and multiagent domains with equal clarity and flexibility, while remaining close enough to the existing frameworks that existing analytic and learning tools can be applied with little or no adaptation. Some simple and larger examples illustrate the similarities and differences of this approach, and a discussion of the challenges inherent in developing more flexible tools to exploit these new frameworks concludes matters.
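The kind of finite stochastic process the abstract is concerned with can be sketched as a Markov chain over a small state set. The states and transition probabilities below are purely illustrative and not taken from the paper:

```python
import random

# A finite stochastic process as a Markov chain: a finite state set and a
# row-stochastic transition table. States/probabilities are made up here.
STATES = ["idle", "active", "done"]
TRANSITIONS = {
    "idle":   {"idle": 0.5, "active": 0.5, "done": 0.0},
    "active": {"idle": 0.1, "active": 0.6, "done": 0.3},
    "done":   {"idle": 0.0, "active": 0.0, "done": 1.0},  # absorbing state
}

def step(state: str, rng: random.Random) -> str:
    """Sample the next state from the current state's distribution."""
    row = TRANSITIONS[state]
    return rng.choices(list(row), weights=list(row.values()))[0]

def trajectory(start: str, horizon: int, seed: int = 0) -> list[str]:
    """Simulate a trajectory of `horizon` transitions from `start`."""
    rng = random.Random(seed)
    states = [start]
    for _ in range(horizon):
        states.append(step(states[-1], rng))
    return states
```

A "control sequence" in the abstract's sense would additionally condition each transition on an agent's action, but the sampling machinery is the same.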
Exploiting Symmetries in Two-Player Zero-Sum Markov Games with an Application to Robot Soccer
Cited by 1 (0 self)
Abstract. The exploitation of symmetries in Markov decision processes has proven to be a powerful concept for state space reduction without compromising solution optimality. In this paper we show how to perform a symmetry reduction for two-player zero-sum Markov games with additional symmetry properties. The concept of equivariance symmetry justifies, for the first time, not only lifting the standard MDP symmetries to the Markov game case but also a qualitatively new symmetry: the exchange of the two adversarial players. The proof technique also deviates from the standard approach, for it directly utilizes the uniqueness of Bellman's equation. We apply our reduction procedure to a multiplayer robot soccer model.
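The flavour of symmetry reduction described here, including the player-exchange symmetry, can be illustrated on a toy state space. The 1-D "field" and the two symmetries below are hypothetical and much simpler than the paper's robot-soccer model:

```python
# Toy symmetry reduction: states are (attacker_pos, defender_pos) pairs on a
# 1-D field of width W. Two symmetries form a Klein four-group with identity:
#   mirror - reflect the field left-right (a standard MDP-style symmetry),
#   swap   - exchange the two players (in a zero-sum game the value at
#            swap(s) is the negation of the value at s, so a solver keeping
#            only canonical states must also record a sign).
W = 5

def mirror(state):
    a, d = state
    return (W - 1 - a, W - 1 - d)

def swap(state):
    a, d = state
    return (d, a)

def canonical(state):
    """Smallest state in the orbit under {id, mirror, swap, mirror.swap}."""
    orbit = {state, mirror(state), swap(state), mirror(swap(state))}
    return min(orbit)

full = {(a, d) for a in range(W) for d in range(W)}
reduced = {canonical(s) for s in full}  # solve only these representatives
```

By Burnside's lemma the 25 states collapse to 9 orbit representatives here; the paper's point is that a game solver only needs to compute values on the reduced set.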
Learning to Act Stochastically
Cited by 1 (1 self)
This thesis examines reinforcement learning for stochastic control processes with single and multiple agents, where either the learning outcomes are stochastic policies or learning is perpetual and within the domain of stochastic policies. In this context, a policy is a strategy for processing environmental outputs (called observations) and subsequently generating a response or input signal to the environment (called actions). A stochastic policy gives a probability distribution over actions for each observed situation, and the thesis concentrates on finite sets of observations and actions. There is an exclusive focus on stochastic policies for two principal reasons: such policies have been relatively neglected in the existing literature, and they have been recognised to be especially important in the field of multiagent reinforcement learning. For the latter reason, the thesis concerns itself primarily with solutions best suited to multiagent domains. This restriction proves essential, since the topic is otherwise too broad to be covered in depth without losing some clarity and focus. The thesis is partitioned into three parts, with a chapter of contextual information preceding the first part. Part 1 focuses on analytic and formal mathematical approaches
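The abstract's definition of a finite stochastic policy, a probability distribution over actions for each observation, admits a direct sketch. The observation and action names below are placeholders, not drawn from the thesis:

```python
import random

# A finite stochastic policy: for each observation, a probability
# distribution over a finite action set. Names here are illustrative only.
Policy = dict[str, dict[str, float]]

policy: Policy = {
    "ball_left":  {"move_left": 0.8, "move_right": 0.1, "stay": 0.1},
    "ball_right": {"move_left": 0.1, "move_right": 0.8, "stay": 0.1},
}

def act(policy: Policy, observation: str, rng: random.Random) -> str:
    """Sample an action from the policy's distribution for this observation."""
    dist = policy[observation]
    return rng.choices(list(dist), weights=list(dist.values()))[0]
```

A deterministic policy is the special case where each distribution puts probability 1 on a single action; the thesis's interest is in the general, properly randomised case.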
Algorithms, Theory
This paper introduces a multiagent reinforcement learning algorithm that converges with a given accuracy to stationary Nash equilibria in general-sum discounted stochastic games. Under some assumptions we formally prove its convergence to Nash equilibrium in self-play. We claim that it is the first algorithm that converges to stationary Nash equilibrium in the general case.
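The Nash-equilibrium condition the abstract targets, that no player can gain by unilaterally deviating, is easy to check in the simplest general-sum setting, a one-shot bimatrix game. This sketch verifies pure equilibria of the standard Prisoner's Dilemma; it does not reproduce the paper's algorithm:

```python
import itertools

# Pure-strategy Nash check in a two-player general-sum (bimatrix) game.
# Payoffs are the textbook Prisoner's Dilemma, as (row payoff, col payoff).
ACTIONS = ["C", "D"]
PAYOFF = {
    ("C", "C"): (3, 3), ("C", "D"): (0, 5),
    ("D", "C"): (5, 0), ("D", "D"): (1, 1),
}

def is_nash(row, col):
    """True iff neither player can gain by unilaterally deviating."""
    r0, c0 = PAYOFF[(row, col)]
    row_ok = all(PAYOFF[(r, col)][0] <= r0 for r in ACTIONS)
    col_ok = all(PAYOFF[(row, c)][1] <= c0 for c in ACTIONS)
    return row_ok and col_ok

equilibria = [p for p in itertools.product(ACTIONS, ACTIONS) if is_nash(*p)]
```

The paper's setting is far harder: stationary (possibly mixed) equilibria of discounted stochastic games, where the same best-response condition must hold at every state under the discounted value functions.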
Q-learning in Two-Player Two-Action Games. Monica Babes, Rutgers University
Q-learning is a simple, powerful algorithm for behavior learning. It was derived in the context of single agent decision making in Markov decision process environments, but its applicability is much broader—in experiments in multiagent environments, Q-learning has also performed well. Our preliminary analysis using dynamical systems finds that Q-learning’s indirect control of behavior via estimates of value contributes to its beneficial performance in general-sum two-player games like the Prisoner’s Dilemma.
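The setting the abstract studies, independent Q-learners in a repeated two-player two-action game, can be sketched as follows. The hyperparameters (learning rate, exploration rate, episode count) are illustrative choices, not taken from the paper:

```python
import random

# Two independent epsilon-greedy Q-learners repeatedly playing the
# Prisoner's Dilemma. Each agent keeps a stateless Q-table mapping its own
# action to an estimated value, exactly the "indirect control of behavior
# via estimates of value" the abstract refers to.
ACTIONS = ["C", "D"]  # cooperate, defect
PAYOFF = {("C", "C"): (3, 3), ("C", "D"): (0, 5),
          ("D", "C"): (5, 0), ("D", "D"): (1, 1)}

def train(episodes=5000, alpha=0.1, eps=0.1, seed=0):
    rng = random.Random(seed)
    q = [{a: 0.0 for a in ACTIONS}, {a: 0.0 for a in ACTIONS}]
    for _ in range(episodes):
        # each player picks epsilon-greedily from its own Q-table
        acts = tuple(
            rng.choice(ACTIONS) if rng.random() < eps else max(qi, key=qi.get)
            for qi in q
        )
        rewards = PAYOFF[acts]
        for i in (0, 1):  # one-step Q update (no successor state here)
            q[i][acts[i]] += alpha * (rewards[i] - q[i][acts[i]])
    return q
```

The abstract's dynamical-systems analysis concerns exactly how trajectories of these coupled Q-value updates evolve, which this sketch only simulates rather than analyses.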