Results 1 - 5 of 5
Temporal sequence learning, prediction and control - a review of different models and their relation to biological mechanisms
- Neural Computation, 2004
Cited by 17 (3 self)
In this article we compare methods for temporal sequence learning (TSL) across the disciplines of machine control, classical conditioning, neuronal models for TSL, and spike-timing-dependent plasticity. This review will briefly introduce the most influential models and focus on two questions: 1) To what degree are reward-based (e.g. TD-learning) and correlation-based (Hebbian) learning related? and 2) How do the different models correspond to possibly underlying biological mechanisms of synaptic plasticity? We will first compare the different models in an open-loop condition, where behavioral feedback does not alter the learning. Here we observe that reward-based and correlation-based learning are indeed very similar. Machine control is then used to introduce the problem of closed-loop control (e.g. “actor-critic architectures”). Here the problem of evaluative (“rewards”) versus nonevaluative (“correlations”) feedback from the environment will be discussed, showing that the two learning approaches are fundamentally different in the closed-loop condition. In trying to answer the second question we will compare neuronal versions of the different learning architectures to the anatomy of the involved brain structures (basal-ganglia, thalamus and
Isotropic Sequence Order Learning
, 2003
Cited by 12 (8 self)
In this article, we present an isotropic unsupervised algorithm for temporal sequence learning. No special reward signal is used, so all inputs are completely isotropic. All input signals are bandpass filtered before converging onto a linear output neuron. All synaptic weights change according to the correlation of bandpass-filtered inputs with the derivative of the output. We investigate the algorithm in an open- and a closed-loop condition, the latter being defined by embedding the learning system into a behavioral feedback loop. In the open-loop condition, we find that the linear structure of the algorithm allows analytically calculating the shape of the weight change, which is strictly heterosynaptic and follows the shape of the weight change curves found in spike-time-dependent plasticity. Furthermore, we show that synaptic weights stabilize automatically when no more temporal differences exist between the inputs, without additional normalizing measures. In the second part of this study, the algorithm is placed in an environment that leads to a closed sensorimotor loop. To this end, a robot is programmed with a prewired retraction reflex reaction in response to collisions. Through isotropic sequence order (ISO) learning, the robot achieves collision avoidance by learning the correlation between its early range-finder signals and the later-occurring collision signal. Synaptic weights stabilize at the end of learning as theoretically predicted. Finally, we discuss the relation of ISO learning with other drive reinforcement models and with the commonly used temporal difference learning algorithm. This study is followed up by a mathematical analysis of the closed-loop situation in the companion article in this issue, “ISO Learning Approximates a Solution to the Inverse-Controller Problem in an Unsupervised Behavioral Paradigm” (pp. 865–884).
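The learning rule described in the abstract above (bandpass-filtered inputs, a linear output neuron, and weight changes driven by the correlation of each filtered input with the derivative of the output) can be illustrated with a minimal open-loop sketch. This is not the authors' implementation. The damped-sinusoid filter kernel, the parameter values (`freq`, `damping`, `mu`), the backward-difference derivative, and the function names `bandpass_kernel`, `iso_learning`, and `make_pulses` are all assumptions chosen for illustration.

```python
import numpy as np


def bandpass_kernel(freq=0.01, damping=0.03, length=400):
    """Impulse response of a damped resonator (assumed filter shape)."""
    k = np.arange(length)
    return np.exp(-damping * k) * np.sin(2.0 * np.pi * freq * k)


def iso_learning(inputs, initial_weights, mu=0.01):
    """Run the correlation rule over a (n_inputs, n_steps) array of signals.

    Every weight is updated by the same rule:
        rho_j += mu * u_j(t) * dv/dt
    where u_j is the filtered input and v the linear output.
    """
    inputs = np.asarray(inputs, dtype=float)
    n_steps = inputs.shape[1]
    kernel = bandpass_kernel()
    # Filter each input causally and trim back to the original length.
    u = np.array([np.convolve(x, kernel)[:n_steps] for x in inputs])
    rho = np.array(initial_weights, dtype=float)
    v_prev = 0.0
    for t in range(n_steps):
        v = rho @ u[:, t]        # linear output neuron
        dv = v - v_prev          # backward-difference derivative (assumption)
        rho += mu * u[:, t] * dv
        v_prev = v
    return rho


def make_pulses(step_x0, step_x1, n_steps=1000):
    """Two unit pulses: x0 at step_x0, x1 at step_x1."""
    x = np.zeros((2, n_steps))
    x[0, step_x0] = 1.0
    x[1, step_x1] = 1.0
    return x


# x1 precedes x0 by 15 steps: the weight of the earlier input grows.
rho_early = iso_learning(make_pulses(115, 100), initial_weights=[1.0, 0.0])
# x1 follows x0 by 15 steps: the weight of the later input shrinks.
rho_late = iso_learning(make_pulses(100, 115), initial_weights=[1.0, 0.0])
print(rho_early[1], rho_late[1])
```

With these assumed settings, the sign of the change in the second weight depends on the temporal order of the two pulses, which is the qualitative behavior the abstract relates to spike-time-dependent plasticity curves. The sign relationship only holds while the delay is shorter than half the filter's oscillation period.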
Actor-Critic models of animal control -- a critique of reinforcement learning
- Proceedings of the Fourth International ICSC Symposium on Engineering of Intelligent Systems, 2004
Cited by 2 (0 self)
In this article we will compare traditional reinforcement learning techniques with a novel correlation-based algorithm. We will discuss several problems that occur in reward-based reinforcement learning and outline alternative solutions. An example of a robot control task shown at the end will support our claims.
A Neural Model for the Adaptive Control of Saccadic Eye Movements
Several studies have suggested different cost functions to explain the kinematic characteristics of saccades. However, these studies do not present any neural implementation of the optimization procedure they use. Instead, they are based on optimal control theory approaches that provide a global analytical solution rather than a local adaptation scheme. In this study, we propose a model comprising an open-loop neural controller and an adaptation unit. The neural controller receives the initial target position as input. The adaptation unit, which is the neural interpretation of a simple cost function, evaluates the optimality of this controller and induces weight changes in the controller via a local learning rule. Realistic saccades are obtained with the proposed model. We speculate that the superior colliculus and the cerebellum behave quite similarly to our model’s neural controller and adaptation unit.

