Learning to Act Stochastically
BibTeX
@MISC{Dickens_learningto,
author = {Luke Dickens},
title = {Learning to Act Stochastically},
year = {}
}
Abstract
This thesis examines reinforcement learning for stochastic control processes with single and multiple agents, where either the learning outcomes are stochastic policies or learning is perpetual and within the domain of stochastic policies. In this context, a policy is a strategy for processing environmental outputs (called observations) and subsequently generating a response or input-signal to the environment (called actions). A stochastic policy gives a probability distribution over actions for each observed situation, and the thesis concentrates on finite sets of observations and actions. There is an exclusive focus on stochastic policies for two principal reasons: such policies have been relatively neglected in the existing literature, and they have been recognised to be especially important in the field of multi-agent reinforcement learning. For the latter reason, the thesis concerns itself primarily with solutions best suited to multi-agent domains. This restriction proves essential, since the topic is otherwise too broad to be covered in depth without losing some clarity and focus. The thesis is partitioned into three parts, with a chapter of contextual information preceding the first part. Part 1 focuses on analytic and formal mathematical approaches







