P. Gaussier, A. Revel, C. Joulain, and S. Zrehen, `Living in a partially structured environment: How to bypass the limitations of classical reinforcement techniques', Robotics and Autonomous Systems, 20, (1997).
....RNNs are as black box as any neural network. There are other approaches dealing with learning in the autonomous robot control context. Some kind of extended short-term memory is needed as a prerequisite to handle non-trivial time-related dependencies between sensor readings and the world. In [4, 8] this is similar to our RNN-based approach, but direct sensory-motor connections are learned which are strongly related to the concrete environment, the particular behavior system and the current robot task. Making knowledge explicit, like in the symbol grounding case, overcomes most of such ....
P. Gaussier, A. Revel, C. Joulain, and S. Zrehen, `Living in a partially structured environment: How to bypass the limitations of classical reinforcement techniques', Robotics and Autonomous Systems, 20, (1997).
....or from the left. [17] In addition, we have developed a probabilistic conditioning rule that allows our robot to learn sensory-motor associations from a delayed signal. This algorithm has been successfully tested on real maze problems [20]. Regarding navigation in open environments, we have shown that our neural architecture can be used to let our robot reach any location with high precision using natural visual landmarks. However, the discovery of ....
P. Gaussier, A. Revel, C. Joulain, and S. Zrehen. Living in a partially structured environment: How to bypass the limitation of classical reinforcement techniques. Robotics and Autonomous Systems, 20:225--250, 1997.
....during the execution of a go-ahead movement, the robot must be able to turn slightly right or left. As the environment is structured (only corridors and T junctions), a corridor-following reflex can be implemented using information on the location of the vanishing point in the image (see (Gaussier et al. 1996)). This reflex could also have been learned as a simple conditioning problem (learning the correct association scheme in order not to collide with the walls). Our architecture supposes that a first level of shaping has already been performed and that its result is included in the background of the ....
....at a time. Practically, the solution to the maze problem is found quickly if there are no more than 3 or 4 pictogram categories. This condition is met first because of Gabor filtering and also because categories are coded on a Probabilistic Topological Map (PTM) (Gaussier & Zrehen, 1994; Gaussier et al. 1996; Zrehen, 1995). The interest is that this map has topology preservation properties. In fact, when two input patterns are similar they are coded on neurons that are close to each other and, due to a diffusion mechanism, they trigger the same reaction. Thus, PTM allows to take for free the ....
Gaussier, P., Revel, A., Joulain, C., & Zrehen, S. 1996. Living in a partially structured environment: How to bypass the limitation of classical reinforcement techniques. To appear in Robotics and Autonomous Systems.
.... It is easy to build a learning rule that is triggered when the sum of the place cell responses decreases [9]. The robot would then find a movement that allows it to go in a direction associated with a global increase of the goal recognition (an efficient reinforcement learning rule is described in [7]). Our future work will consist in testing on the real robot a planning level that allows the robot to pass from one subgoal to another in order to reach a particular goal. ....
P. Gaussier, A. Revel, C. Joulain, and S. Zrehen. Living in a partially structured environment: How to bypass the limitation of classical reinforcement techniques. to appear in Robotics and Autonomous Systems, 1997.
....the algorithm, we assumed that the categorization problem was already solved. However, in an experiment with a real robot, categorization is a decisive step because it drastically reduces the complexity of analyzing a scene. In other work [1, 2] we show how PCR can be integrated into a hierarchical architecture to manage our robot's navigation in a real maze. However, we show that such an architecture requires learning to proceed according to a shaping technique (learning ....
P. Gaussier, A. Revel, C. Joulain, and B. Gas. Living in a partially structured environment: How to bypass the limitation of classical reinforcement techniques. submitted to Robotics and Autonomous Systems, 1996.
....solution consists in building a representation of the transition between two situations. Let AB be the internal representation of the transition between A and B. The associated action (the movement that leads from A to B) can be learned using, for instance, a probabilistic conditioning rule (see [8] for details). During planning, a motivation backpropagation mechanism (use of the cognitive map to plan [15]) activates the neuron indicating the movement that must be performed in order to reach the goal. The system learns to predict the possible transition(s) from the current ....
P. Gaussier, A. Revel, C. Joulain, and S. Zrehen. Living in a partially structured environment: How to bypass the limitation of classical reinforcement techniques. Robotics and Autonomous Systems, 20:225--250, 1997.
No context found.
Gaussier, P., Revel, A., Joulain, C., & Zrehen, S. 1997c. Living in a partially structured environment: How to bypass the limitation of classical reinforcement techniques. Robotics and Autonomous Systems, 225--250.
....of novelty of a shape for it to be learned. 2.2 Choosing an action. Each SV is connected by an inhibitory link and an excitatory link to the motor output (SMR1). These connections are probabilistic links, an analog version of PCR links (Probabilistic Conditioning Rule [3]). In our case, Robot Motor Output number 1 (SMR1, "Sortie Motrice du Robot") is reduced to three neurons: "turn 90° left", "go straight ahead" and "turn 90° right". As Figure 1b shows, the robot sequentially explores several zones; at each of these zones ....
....with an inhibitory link that will always be inhibitory or zero and an excitatory link that will likewise always be excitatory or zero. These inhibitory links are essential: they prevent the first movement learned from always being the movement performed (for a more detailed explanation see [3]). These links also speed up learning, because the activity associated with a wrong movement is reduced, so that movement is less likely to be activated again by the shape. The experiments also show that the robot is able to determine which ....
P. Gaussier, A. Revel, C. Joulain, and B. Gas. Living in a partially structured environment: How to bypass the limitation of classical reinforcement techniques. submitted to Robotics and Autonomous Systems, 1996.
....to learn with a delay. This also precludes learning intermediate steps, such as would be required in a maze environment, where some junctions would contain no interesting targets. However, we have devised a modified version of the algorithm that allows learning with a delay in maze environments (Gaussier, Revel et al. 1996). Further study will aim at implementing the proposed motivational module in a maze environment. 3. Learning is directly linked to the satisfaction of an active drive. A more realistic system should allow a form of latent learning that varies with the intensity of the drives. 4. The interactions ....
Gaussier, P., A. Revel, et al. (1996). "Living in a partially structured environment: how to bypass the limitations of classical reinforcement techniques." to appear in Robotics and Autonomous Systems.