10 citations found.
S. S. Keerthi and B. Ravindran, A Tutorial Survey of Reinforcement Learning, to appear in : Sadhana.


This paper is cited in the following contexts:
A Hybrid Symbolic Subsymbolic Controller for Complex.. - Apolloni, Piccolboni..

....the weights of the other network which is deputed to these proposals. Sophisticated policies for rewarding the control action proposals are studied in the literature based on relaxed versions of Dynamic Programming [55]. However, actual applications in the control field generally concern model-free [64], possibly memoryless [143] rewarding methods. The joint use of neural networks and symbolic tools is a favourite approach in the control scientist community. Actually, system theory boasts so many results that to ignore it completely very often appears wasteful. This cooperation might have ....

S. S. Keerthi, B. Ravindran, "A tutorial survey of reinforcement learning", TR Dept of Comp. Sci. and Automat., Indian Institute of Sci., Bangalore, 1995.


Neural Network Control Using Active Learning - RayChaudhuri, Hamey, Bell (1995)

....Networks 3 Reinforcement and Active Learning Systems. Study of animal learning and human cognition has yielded the concept of reinforcement learning. A reinforcement learning system is one which learns an associative mapping by maximizing a scalar evaluation of its performance from the environment [5]. The concept of active learning is similar. An active learning system is one that can influence the training data it receives by actions or queries to its environment [2]. If a module can be constructed which is able to design and conduct active experimentation and to decide when it is necessary ....

S. Sathiya Keerthi and B. Ravindran. A tutorial survey of reinforcement learning. Sadhana (published by the Indian Academy of Sciences), January 1995.


Learning Decentralized Goal-based Vector Quantization - Gupta, Borkar

....less than N_j. We now discuss the following two methods to achieve the above goal: the credit assignment method, and the sequential adaptation method. Credit assignment method: The basic idea has been adapted from the reinforcement learning literature (a good survey can be found in [13]). In the problems considered, the reinforcement signal r for an action comes a random amount of time after the action has been taken. Hence credit for receiving r is to be assigned among all the actions taken till the arrival of r. Such an assignment problem is usually referred to as temporal credit ....

S. Sathiya Keerthi and B. Ravindran, "A tutorial survey of reinforcement learning", SĀDHANĀ: Indian Academy of Science Proceedings in Engg. Sciences, 19 (1994) 851-889.


Quo Vadis - A Framework for Intelligent Routing in Large.. - Mikler, Wong, Honavar (1997)

....adapt the tunable parameters (α, β, γ, used by Quo Vadis at each node in response to changes in network dynamics are of interest. In particular, variations of techniques drawn from adaptive control [White & Sofge, 1992] and machine learning [Honavar, 1994], especially reinforcement learning [Keerthi & Ravindran, 1994], are currently under investigation. For examples of preliminary work by other investigators on this topic, the reader is referred to [Littman and Boyan 1993; Lehman et al. 1993]. In conclusion, it must be noted that Quo Vadis exemplifies a family of parameterized algorithms, different instances ....

Keerthi, S.S. and Ravindran, B. A Tutorial Survey of Reinforcement Learning. (preprint) (1994).


Quo Vadis - Adaptive Heuristics for Routing in Large.. - Mikler, Wong, Honavar (1995)

....the tunable parameters (α, β, γ, j, used by Quo Vadis at each node in response to changes in network dynamics are of interest. In particular, variations of techniques drawn from adaptive control [White & Sofge, 1992] and machine learning [Honavar, 1994], especially reinforcement learning [Keerthi & Ravindran, 1994], are currently under investigation. For examples of preliminary work by other investigators on this topic, the reader is referred to [Littman and Boyan 1993; Lehman et al. 1993]. In conclusion, it must be noted that Quo Vadis exemplifies a family of parameterized algorithms, different instances of ....

Keerthi, S.S. and Ravindran, B. A Tutorial Survey of Reinforcement Learning. (preprint) (1994).


Cost-Effective Querying Leading To Dual Control - RayChaudhuri, Hamey (1996)

....choice of data. Currently there exist two overall separate recognised approaches to implementing active learning: reinforcement learning and querying. Whereas the former refers broadly to a class of problems which involve policy-based learning algorithms using state-action transition models [10, 9], the latter is concerned with techniques of asking the most appropriate questions, usually for optimal information gain [23]. While the two approaches must necessarily have elements in common, not much exists in the literature that addresses their unification. In this work we will adhere mostly ....

S. Sathiya Keerthi and B. Ravindran. A tutorial survey of reinforcement learning. Sadhana (published by the Indian Academy of Sciences), January 1995.


Network Performance Management Using Reinforcement Learning - Prem Kumar And

No context found.

S. S. Keerthi and B. Ravindran, A Tutorial Survey of Reinforcement Learning, to appear in : Sadhana.


A Short Introduction to Reinforcement Learning - Hagen, Kröse (1997)

No context found.

S. Sathya Keerthi and B. Ravindran. A tutorial survey of reinforcement learning. Sadhana, 19(6):851--889, 1994.


Parameterized Heuristics for Intelligent Adaptive Network.. - Armin Mikler

No context found.

Keerthi, S.S. and Ravindran, B. A Tutorial Survey of Reinforcement Learning. (preprint) (1994).


From Conventional Control to Intelligent Neurocontrol.. - RayChaudhuri, Hamey (1995)

No context found.

S. Sathiya Keerthi and B. Ravindran. A tutorial survey of reinforcement learning. Sadhana (published by the Indian Academy of Sciences), January 1995.


CiteSeer.IST - Copyright Penn State and NEC