6 citations found.
S. Thrun, K. Moller, and A. Linden. Adaptive look ahead planning. In Proceedings OEGAI 90, 1990.


This paper is cited in the following contexts:
On Planning And Exploration In Non-Discrete Environments - Thrun, Möller (1991)   (4 citations)  Self-citation (Thrun Moller)

....has received considerable attention in the last few years [And86, Bar89, Sut84]. In general there are two principles to solve reinforcement learning problems: direct and indirect techniques, both having their advantages and disadvantages. We present a system that combines both methods [TML91, TML90]. By interaction with an unknown environment, a world model is progressively constructed using the backpropagation algorithm. For optimizing actions with respect to future reinforcement, planning is applied in two steps: An experience network proposes a plan which is subsequently optimized by ....

....Planning has also been used for reinforcement learning; e.g., Sutton [Sut90] uses off-line planning for improving the controller without interacting with the world. In this article we will present a planning technique which relies on a combination of direct and indirect learning control [TML91, TML90]. A model network which approximates the behavior of the world is used for looking ahead into the future and optimizing actions by gradient descent with respect to future reinforcement. In addition, an experience network is trained like a controller but used for accelerating 1 Even this is a science ....

[Article contains additional citation context not shown here]

S. Thrun, K. Moller, and A. Linden. Adaptive look-ahead planning. In Proceedings OEGAI 90, 1990.
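The look-ahead idea quoted in the contexts above (roll a world model forward, then adjust the proposed actions by gradient descent with respect to future reinforcement) can be illustrated with a minimal sketch. Everything named here is an assumption made for illustration: `model` is a fixed toy function standing in for a trained model network, the initial plan stands in for the proposal of an experience network, the reinforcement is a made-up distance-to-goal score, and finite differences replace the gradient computation through the network that the cited papers describe.

```python
def model(state, action):
    """Hypothetical stand-in for a trained model network: predicts the next state."""
    return 0.9 * state + action


def predicted_reinforcement(state, plan, goal=1.0):
    """Roll the model forward over the whole plan and score the final state."""
    for action in plan:
        state = model(state, action)
    return -(state - goal) ** 2


def optimize_plan(state, plan, steps=200, lr=0.1, eps=1e-5):
    """Improve a proposed plan by gradient ascent on predicted reinforcement.

    Gradients are estimated by finite differences (an assumption of this
    sketch, not the method of the cited papers).
    """
    plan = list(plan)
    for _ in range(steps):
        base = predicted_reinforcement(state, plan)
        grad = []
        for i in range(len(plan)):
            bumped = plan[:i] + [plan[i] + eps] + plan[i + 1:]
            grad.append((predicted_reinforcement(state, bumped) - base) / eps)
        # Move every action a small step in the direction of higher reinforcement.
        plan = [a + lr * g for a, g in zip(plan, grad)]
    return plan


# The all-zero plan plays the role of the experience network's initial proposal.
initial_plan = [0.0, 0.0, 0.0]
improved_plan = optimize_plan(0.0, initial_plan)
```

In this toy setting the optimized plan drives the predicted final state close to the goal, so its predicted reinforcement is higher than that of the initial proposal.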


Learning By Error-Driven Decomposition - Fox, Heinze, Möller, Thrun, Veenker (1991)   (3 citations)  Self-citation (Thrun)

....to backpropagation is sped up by using many small networks in a modular fashion rather than a single large one. • Execution speed is enhanced by modularization, and it is possible to perform faster (gradient directed) search in input space, as it is used by a planning method described in [TML90, TML91]. • The problem of determining the number of hidden units is circumvented. The system performs an adaptive resource allocation in a way such that difficult parts of a function attract more networks (or hidden units) than easier ones. Destructive constructive extensions can be incorporated ....

S. Thrun, K. Moller, and A. Linden. Adaptive look ahead planning. In G. Dorffner, editor, Konnektionismus in Artificial Intelligence, Springer-Verlag, Berlin, 1990.


Learning By Error-Driven Decomposition - Fox, Heinze, Möller, Thrun, Veenker (1991)   (3 citations)  Self-citation (Thrun)

....to backpropagation is sped up by using many small networks in a modular fashion rather than a single large one. • Execution speed is enhanced by modularization, and it is possible to perform faster (gradient directed) search in input space, as it is used by a planning method described in [TML90, TML91]. • The problem of determining the number of hidden units is circumvented. The system performs an adaptive resource allocation in a way such that difficult parts of a function attract more networks (or hidden units) than easier ones. Destructive constructive extensions can be incorporated ....

S. Thrun, K. Moller, and A. Linden. Adaptive look-ahead planning. In G. Dorffner, editor, Konnektionismus in Artificial Intelligence, Springer-Verlag, Berlin, 1990.


A General Feed-Forward Algorithm for Gradient Descent in.. - Thrun, Smieja (1990)   (2 citations)  Self-citation (Thrun)

No context found.

S. Thrun, K. Moller, and A. Linden. Adaptive look ahead planning. In Proceedings OEGAI 90, 1990.


Planning with an Adaptive World Model - Thrun (1991)   (5 citations)  Self-citation (Thrun Moller Linden)

....for Computer Science (GMD), D 5205 St. Augustin, FRG; Knut Moller, University of Bonn, Department of Computer Science, D 5300 Bonn, FRG; Alexander Linden, German National Research Center for Computer Science (GMD), D 5205 St. Augustin, FRG. Abstract: We present a new connectionist planning method [TML90]. By interaction with an unknown environment, a world model is progressively constructed using gradient descent. For deriving optimal actions with respect to future reinforcement, planning is applied in two steps: an experience network proposes a plan which is subsequently optimized by gradient ....

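....is the time of the $s$th action. Thus, for each action ($\forall i, s$) its influence on later activations ($\forall j, \forall \tau \geq s$) of the chain of networks, including all predictions, is measured by $\xi^{is}_j(\tau)$. It has been shown in an earlier paper that this gradient can easily be propagated forward through the network [TML90]:

$$
\xi^{is}_j(\tau) =
\begin{cases}
\delta_{ij}\,\delta_{\tau s} & \text{if } j \text{ action input unit} \\
0 & \text{if } \tau = 1 \wedge j \text{ state/context input unit} \\
\xi^{is}_{j'}(\tau - 1) & \text{if } \tau > 1 \wedge j \text{ state/context input unit } (j' \text{ corresponding output unit of preceding model}) \\
\text{logistic}'(\text{net}_j(\tau)) \cdot \sum_{l \in \text{pred}(j)} \text{weight}_{jl}\, \xi^{is}_l(\tau) & \dots
\end{cases}
$$

....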

S. Thrun, K. Moller, and A. Linden. Adaptive look-ahead planning. In G. Dorffner, editor, Proceedings KONNAI/OEGAI, Springer, Sept. 1990.
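The forward propagation of the action gradient described in the second context above can be sketched on a deliberately tiny example. The assumptions are loud ones: the "model network" here is a single logistic unit with one state input and one action input, the weights `w_state` and `w_action` are arbitrary illustrative values, and the function name `forward_gradients` is hypothetical. The sketch only shows the pattern of carrying the derivative of each output with respect to each earlier action forward in time, alongside the activations themselves.

```python
import math


def logistic(x):
    return 1.0 / (1.0 + math.exp(-x))


def forward_gradients(x0, actions, w_state=0.8, w_action=1.5):
    """Return (outputs, xi) where xi[s][t] = d output(t) / d action(s).

    The gradients are propagated forward through the chain of model
    applications, in step with the activations.
    """
    n = len(actions)
    # At the first step the state input does not depend on any action.
    xi_prev = [0.0] * n
    x = x0
    outputs = []
    xi = [[0.0] * n for _ in range(n)]
    for t, a in enumerate(actions):
        net = w_state * x + w_action * a
        x = logistic(net)
        slope = x * (1.0 - x)  # derivative of the logistic at net
        for s in range(n):
            # Kronecker delta: the action input at step t depends only on action t.
            action_input = 1.0 if s == t else 0.0
            xi[s][t] = slope * (w_state * xi_prev[s] + w_action * action_input)
        # The next step's state input copies the gradient of this step's output.
        xi_prev = [xi[s][t] for s in range(n)]
        outputs.append(x)
    return outputs, xi


outputs, xi = forward_gradients(0.1, [0.2, -0.4, 0.7])
```

Because the gradient is accumulated while stepping forward, no backward pass over the whole chain is needed; an entry `xi[s][t]` with `s > t` stays zero, since a later action cannot influence an earlier prediction.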


A General Feed-Forward Algorithm for Gradient Descent in.. - Thrun, Smieja (1990)   (2 citations)

No context found.

S. Thrun, K. Moller, and A. Linden. Adaptive look ahead planning. In Proceedings OEGAI 90, 1990.

