D. P. Bertsekas, J. N. Tsitsiklis, C. Wu, Rollout Algorithms for Combinatorial Optimization, Journal of Heuristics 3 (1997) 245--262.


This paper is cited in the following contexts:
Heuristic Search in Infinite State Spaces Guided by Lyapunov.. - Perkins, Barto (2001)   (Correct)

....does not construct a path to a goal state by the time that a sufficient number of updates are performed, then Theorem 3 guarantees the completion of the path afterward. A second method for generating a CLF that meets the conditions of Theorem 3 is to perform roll outs [Tesauro and Galperin, 1996; Bertsekas et al., 1997]. In the present context, this means defining $L_0(s)$ as the cost of the solution path generated by repeated application of $O_1$ starting from $s$ and until it reaches $G$. The function $L_0$ is evaluated by actually constructing the path. The reader may verify that $L_0$ meets the definition of a ....

....of Theorem 3. Performing roll outs can be an expensive method for evaluating leaves because an entire path to a goal state needs to be constructed for each evaluation. However, roll outs have been found to be quite effective in both game playing and sequential control [Tesauro and Galperin, 1996; Bertsekas et al., 1997]. 6 Robot Arm Example We briefly illustrate the theory presented above by applying it to a problem requiring the control of the simulated 3-link robot arm, depicted in Figure 1. The state space of the arm is $\mathbb{R}^6$, corresponding to three angular joint positions and three angular joint ....

D. P. Bertsekas, J. N. Tsitsiklis, and C. Wu. Rollout algorithms for combinatorial optimization. Journal of Heuristics, 1997.
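The roll-out evaluation described in the excerpt above (apply a base operator repeatedly from a state until a goal is reached, and report the accumulated cost) can be sketched roughly as follows. This is only an illustrative sketch, not the cited authors' implementation: the names `rollout_cost`, `base_step`, `is_goal`, `max_steps`, and the toy integer problem are all assumptions introduced here.

```python
from typing import Callable, Tuple, TypeVar

S = TypeVar("S")


def rollout_cost(
    state: S,
    base_step: Callable[[S], Tuple[S, float]],
    is_goal: Callable[[S], bool],
    max_steps: int = 10_000,
) -> float:
    """Cost of the path built by applying base_step until is_goal holds.

    base_step is a hypothetical stand-in for the base operator: it maps a
    state to (next_state, step_cost). max_steps is a safety cap added for
    this sketch so that a non-terminating base operator raises an error.
    """
    total = 0.0
    for _ in range(max_steps):
        if is_goal(state):
            return total
        state, cost = base_step(state)
        total += cost
    raise RuntimeError("base operator did not reach a goal state")


# Toy problem (assumed for illustration): states are non-negative integers,
# the goal is 0, and the base operator decrements the state at unit cost.
def toy_step(state: int) -> Tuple[int, float]:
    return state - 1, 1.0


def toy_is_goal(state: int) -> bool:
    return state == 0


if __name__ == "__main__":
    print(rollout_cost(5, toy_step, toy_is_goal))
```

Because every evaluation builds a whole path, the cost of one call grows with the path length, which is the expense the excerpt mentions.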


Revenue Management in a Dynamic Network Environment - Bertsimas, Popescu (2001)   (1 citation)  (Correct)

.... method is simply a form of policy iteration and is described in detail in Bertsekas and Tsitsiklis [4]. It has been observed in the dynamic programming literature that this procedure systematically improves heuristic performance (see also Bertsimas, Teo and Vohra [6], Bertsekas, Tsitsiklis and Wu [5], Bertsekas and Castanon [3]). For practical purposes, we suggest using Monte Carlo simulation for evaluating the policy value H for a subset of states, and then interpolating these in an online fashion. An interesting research idea is to investigate what types of preprocessing simulations would ....

D.P. Bertsekas, J.N. Tsitsiklis, and C. Wu. Rollout algorithms for combinatorial optimization. Journal of Heuristics, (3):245-262, 1997.


Scheduling Straight-Line Code Using Reinforcement Learning .. - McGovern, Moss, Barto (1999)   (2 citations)  (Correct)

....user's coding characteristics to build schedules better tuned for that user. [UMass Amherst Tech Report Number 99-23] With these motivations in mind, we formulated and tested two autonomous methods of building an instruction scheduler. The first method used rollouts (Tesauro and Galperin, 1996; Bertsekas et al., 1997a,b) and the second focused on reinforcement learning (RL) (Sutton and Barto, 1998). Both methods were implemented for the Digital Alpha 21064. The next section gives a domain overview and discusses results using supervised learning on the same task. 2 Domain overview We focused on scheduling ....

....our algorithms to schedule blocks whose size is greater than 10, we focus on scheduling the longer running blocks. We present timing results using the simulator. 1 3 Rollouts Rollouts are a form of Monte Carlo search, first introduced by Tesauro and Galperin (1996) for use in backgammon. Bertsekas et al. (1997a,b) explored rollouts in other domains and proved important theoretical results. In the instruction scheduling domain, rollouts work as follows: suppose the scheduler comes to a point where it has a partial schedule and a set of (more than one) candidate instructions to add to the schedule. For ....

[Article contains additional citation context not shown here]

Bertsekas, D. P., Tsitsiklis, J. N. & Wu, C. (1997). Rollout algorithms for combinatorial optimization.


Building a Basic Block Instruction Scheduler with.. - McGovern, Moss, Barto (1999)   (Correct)

....of the compiler while potentially sacrificing some quality of the final schedule. With these motivations in mind, we formulated and tested two methods of building an instruction scheduler. The first method used rollouts (Woolsey, 1991; Abramson, 1990; Galperin, 1994; Tesauro and Galperin, 1996; Bertsekas et al., 1997a,b) and the second focused on reinforcement learning (RL) (Sutton and Barto, 1998). We also investigated the effect of combining the two methods. All methods were implemented for the Compaq Alpha 21064. These methods address the time tradeoff directly. Rollouts evaluate schedules online during ....

....up the simulator considerably. 3 Rollouts Rollouts are a form of Monte Carlo search, first introduced in the backgammon literature (Woolsey, 1991; Galperin, 1994; Tesauro and Galperin, 1996). In other domains, Abramson (1990) studied what we call RANDOM-$\pi$ (below) in a game playing context, and Bertsekas et al. (1997a,b) proved important theoretical results for rollouts. In the instruction scheduling domain, rollouts work as follows: suppose the scheduler comes to a point where it has a partial schedule and a set of (more than one) candidate instructions to add to the schedule. The scheduler appends each ....

[Article contains additional citation context not shown here]

Bertsekas, D. P., Tsitsiklis, J. N. & Wu, C. (1997). Rollout algorithms for combinatorial optimization.


Approximate Dynamic Programming for the Solution of.. - Stephen Patek David   (Correct)

....severely complicate matters, and, since the methods of dynamic programming are feasible for only very small problems, we are faced with the necessity of using approximations. The approach we take in this paper is one of Neuro-Dynamic Programming [1] and, more specifically, rollout algorithms [2, 3]. In Section 2 we formally pose the stochastic multiplatform path planning problem. In Section 3, we describe the various solution methodologies that we apply in Section 4 to a specific instance of the problem. Our results indicate that, as one would expect, the qualitative nature of the optimal ....

....J k 1 [ k 1) j u : 5) The minimum in this equation may not be unique. This policy derives its name from the fact that, in practice, the evaluations J k [ k) k] often are not computed precisely but are estimated through online Monte Carlo simulation of the original policy (cf. [2] and [3]). Thus, the evaluations are made by rolling out the die. Whenever we refer to Policy 3 in this paper, we imply that the evaluations of the base policy are exact. 4 Experimental Results We present in this section some computational results for the path planning problem of Section 2. We ....

D. P. Bertsekas, J. N. Tsitsiklis, and C. Wu, "Rollout Algorithms for Combinatorial Optimization," J. of Heuristics, Vol. 3, 1997, pp. 245-262.


Statistical Machine Learning for Large-Scale Optimization - Baluja, Barto, Boese.. (2000)   (Correct)

....search, can significantly enhance local search performance for combinatorial optimization problems. Other DARP Case Studies We also investigated two other learning-based enhancements to combinatorial optimization algorithms, again using DARP as our test problem. We considered the rollout method [109, 108, 106], and we used it to extend a very effective constructive DARP algorithm developed by Kubo and Kasugai [110]. Although our rollout extension is extremely long running, it significantly outperforms the best algorithm reported in [110]. Indeed even a drastically truncated rollout algorithm outperforms ....

Bertsekas, D. P., Tsitsiklis, J. N., and Wu, C. (1997). Rollout Algorithms for Combinatorial Optimization. Journal of Heuristics.


Scheduling Straight-Line Code Using Reinforcement Learning and .. - Amy Mcgovern (1999)   (2 citations)  (Correct)

....at the highest level of optimization. We call the schedules output by the compiler ORIG. This collection has 447,127 basic blocks, containing 2,205,466 instructions. 3 Rollouts Rollouts are a form of Monte Carlo search, first introduced by Tesauro and Galperin (1996) for use in backgammon. Bertsekas et al. (1997a,b) have explored rollouts in other domains and proved important theoretical results. In the instruction scheduling domain, rollouts work as follows: suppose the scheduler comes to a point where it has a partial schedule and a set of (more than one) candidate instructions to add to the schedule. ....

....many times for each instruction to achieve a measure of the average expected outcome. After rolling out each candidate, the scheduler picks the one with the best average running time. Our first set of rollout experiments compared three different rollout policies $\pi$. Although the theory developed by Bertsekas et al. (1997a,b) proved that if we used the DEC scheduler as $\pi$, we would perform no worse than DEC, an architect proposing a new machine might not have a good heuristic available to use as $\pi$, so we also considered policies more likely to be available. The first was the random policy, RANDOM-$\pi$, which is a ....

Bertsekas, D. P., Tsitsiklis, J. N. & Wu, C. (1997). Rollout algorithms for combinatorial optimization.
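The selection step described in the excerpts above (append each candidate to the partial schedule, complete the schedule with a rollout policy several times, and keep the candidate with the best average outcome) might look roughly like the sketch below. It is a toy under stated assumptions, not the schedulers from the cited reports: `rollout_choose`, `random_policy`, `toy_cost`, the integer "instructions", and the cost function are all hypothetical, and lower cost stands in for shorter running time.

```python
import random
from typing import Callable, List, Optional

Schedule = List[int]


def random_policy(partial: Schedule, remaining: List[int],
                  rng: random.Random) -> Schedule:
    """Hypothetical base policy: finish the schedule in random order."""
    rest = list(remaining)
    rng.shuffle(rest)
    return partial + rest


def toy_cost(schedule: Schedule) -> float:
    """Assumed toy cost: position-weighted sum, so large items should go early."""
    return float(sum((i + 1) * x for i, x in enumerate(schedule)))


def rollout_choose(
    partial: Schedule,
    candidates: List[int],
    base_policy: Callable[[Schedule, List[int], random.Random], Schedule],
    cost: Callable[[Schedule], float],
    n_rollouts: int = 20,
    rng: Optional[random.Random] = None,
) -> int:
    """Return the candidate whose rollouts have the lowest average cost.

    Assumes the candidates are distinct values.
    """
    rng = rng or random.Random(0)
    best, best_avg = candidates[0], float("inf")
    for c in candidates:
        remaining = [x for x in candidates if x != c]
        total = 0.0
        for _ in range(n_rollouts):
            # Complete the schedule from partial + [c] with the base policy.
            total += cost(base_policy(partial + [c], remaining, rng))
        avg = total / n_rollouts
        if avg < best_avg:
            best, best_avg = c, avg
    return best


if __name__ == "__main__":
    print(rollout_choose([], [1, 5, 3], random_policy, toy_cost))
```

In this toy, placing 5 first gives an average cost of at most 16, while the other choices average at least 16, so 5 is selected regardless of the random draws. A full scheduler would call `rollout_choose` once per slot, appending the chosen item to `partial` each time.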


Differential Training Of Rollout Policies - Bertsekas (1997)   (12 citations)  Self-citation (Bertsekas)   (Correct)

.... The book by Bertsekas and Tsitsiklis [BeT96] discusses in some detail various aspects of rollout policies in a stochastic context, and also in a deterministic combinatorial optimization context, as a device for magnifying the effectiveness of heuristics (see also Bertsekas, Tsitsiklis, and Wu [BTW97]). A standard policy iteration argument can be used to show that the rollout policy $\bar{\pi}$ is an improved policy over the base policy $\pi$. In particular, let $\bar{J}_k(x)$ be the cost to go of the rollout policy $\bar{\pi}$ starting from a state $x_k$ at time $k$. It can be shown that $\bar{J}_k(x) \le J_k(x)$ for all $x$ and $k$, ....

Bertsekas, D. P., Tsitsiklis, J. N., and Wu, C., 1997. "Rollout Algorithms for Combinatorial Optimization," Report LIDS-P-2386, Lab. for Information and Decision Systems, Mass. Institute of Technology, Cambridge, MA; to appear in Heuristics.


A Branch and Bound Method for Stochastic Integer Problems.. - Beraldi, Ruszczynski (2001)   (1 citation)  (Correct)

No context found.

D. P. Bertsekas, J. N. Tsitsiklis, C. Wu, Rollout Algorithms for Combinatorial Optimization, Journal of Heuristics 3 (1997) 245--262.


Statistical Machine Learning for Large-Scale Optimization - Baluja, Barto, Boese.. (2000)   (Correct)

No context found.

Bertsekas, D. P., Tsitsiklis, J. N. & Wu, C. (1997). Rollout algorithms for combinatorial optimization. Journal of Heuristics.


Basic-block Instruction Scheduling Using Reinforcement.. - McGovern, Moss, Barto (1999)   (Correct)

No context found.

Dmitri P. Bertsekas, John N. Tsitsiklis, and C. Wu. Rollout algorithms for combinatorial optimization. Journal of Heuristics, April 1997.

