5 citations found.
A. McGovern and J. E. B. Moss. Scheduling straight-line code using reinforcement learning and rollouts. In S. Solla, editor, Advances in Neural Information Processing Systems 11, Cambridge, MA, 1998. MIT Press.


This paper is cited in the following contexts:
STD(λ): learning state differences with TD(λ) - Weaver, Baxter

....[Figure 1: Transitions and rewards in the two state system.] Klopf [Harmon et al. 1994] introduced advantage updating, which estimates the value of each state and the relative advantage of each action using separate approximation architectures. More recently, McGovern and Moss [McGovern and Moss, 1998] have used temporal difference learning to develop an instruction scheduler for an optimising compiler. Their approach uses table lookup rather than function approximation, and combines possible successor states into a single feature vector which is mapped to a preference indicator. Bertsekas ....

Amy McGovern and Eliot Moss. Scheduling Straight-Line Code Using Reinforcement Learning and Rollouts. Advances in Neural Information Processing (NIPS '98), 11, 1998.


Learning From State Differences: - Weaver, Baxter (1999)

....the merits of approximating the gradient of the Q function, whilst Baird [18], Harmon and Klopf [19] introduced advantage updating, which estimates the value of each state and the relative advantage of each action using separate approximation architectures. More recently, McGovern and Moss [20] have used temporal difference learning to develop an instruction scheduler for an optimising compiler. Their approach uses table lookup rather than function approximation, and combines possible successor states into a single feature vector which is mapped to a preference indicator. Bertsekas ....

Amy McGovern and Eliot Moss. Scheduling Straight-Line Code Using Reinforcement Learning and Rollouts. Advances in Neural Information Processing (NIPS '98), 11, 1998.


Inducing Heuristics To Decide Whether To Schedule - Cavazos   Self-citation (Moss)

No context found.

A. McGovern and J. E. B. Moss. Scheduling straight-line code using reinforcement learning and rollouts. In S. Solla, editor, Advances in Neural Information Processing Systems 11, Cambridge, MA, 1998. MIT Press.


Building a Basic Block Instruction Scheduler with.. - McGovern, Moss, Barto (1999)   Self-citation (McGovern Moss)

....and a set of (more than one) candidate instructions to add to the schedule. The scheduler appends each candidate to the partial schedule and then follows a fixed policy, p, to schedule the remaining instructions. When the schedule is complete, the scheduler evaluates the ....

[Footnote 2: In previous papers (McGovern and Moss, 1998; McGovern, Moss, and Barto, 1999) we referred to this as ORIG. This change is to reduce confusion.]

[To appear in Machine Learning: Special Issue on Reinforcement Learning, 2000.]

[Figure: rollout with policy p from a partial schedule; best average ....]

McGovern, A. & Moss, E. (1998). Scheduling straight-line code using reinforcement learning and rollouts.


Basic-block Instruction Scheduling Using Reinforcement.. - McGovern, Moss, Barto (1999)   Self-citation (McGovern Moss Barto)

No context found.

Amy McGovern, Eliot Moss, and Andrew G. Barto. Scheduling straight-line code using reinforcement learning and rollouts. Technical Report 99-23, University of Massachusetts, Amherst, April 1999.
