  A reinforcement learning approach to job-shop scheduling (1995) [80 citations — 7 self]

by Wei Zhang, Thomas G. Dietterich
In Proceedings of the Fourteenth International Joint Conference on Artificial Intelligence
http://www.cs.orst.edu/~tgd/publications/ijcai95-jss.ps.gz

Abstract:

We apply reinforcement learning methods to learn domain-specific heuristics for job-shop scheduling. A repair-based scheduler starts with a critical-path schedule and incrementally repairs constraint violations with the goal of finding a short conflict-free schedule. The temporal difference algorithm TD(λ) is applied to train a neural network to learn a heuristic evaluation function over states. This evaluation function is used by a one-step lookahead search procedure to find good solutions to new scheduling problems. We evaluate this approach on synthetic problems and on problems from a NASA space shuttle payload processing task. The evaluation function is trained on problems involving a small number of jobs and then tested on larger problems. The TD scheduler performs better than the best known existing algorithm for this task: Zweben's iterative repair method based on simulated annealing. The results suggest that reinforcement learning can provide a new method for constructing high-performance scheduling systems.
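
The loop the abstract describes (train an evaluation function with TD(λ), then pick repairs by one-step lookahead) can be sketched roughly as follows. This is a toy illustration under loud assumptions, not the authors' system: the state is reduced to a hypothetical count of constraint violations, the two repair operators, the reward of -1 per repair step, and every function name are invented for the example, and a single linear weight stands in for the neural network.

```python
from typing import List

SCALE = 10.0  # hypothetical feature scaling, chosen only for this toy


def features(violations: int) -> List[float]:
    """Hypothetical state features: the scaled violation count."""
    return [violations / SCALE]


def value(weights: List[float], violations: int) -> float:
    """Linear evaluation function; a conflict-free schedule is terminal (0)."""
    if violations == 0:
        return 0.0
    return sum(w * x for w, x in zip(weights, features(violations)))


def successors(violations: int) -> List[int]:
    """Hypothetical repair operators: remove one or two violations."""
    return sorted({max(violations - 2, 0), max(violations - 1, 0)})


def lookahead_step(weights: List[float], violations: int) -> int:
    """One-step lookahead: move to the successor with the best estimate."""
    return max(successors(violations), key=lambda s: value(weights, s))


def train(episodes: int = 100, start: int = 6, alpha: float = 0.5,
          gamma: float = 1.0, lam: float = 0.7) -> List[float]:
    """TD(lambda) with accumulating eligibility traces on the toy problem."""
    weights = [0.0]
    for _ in range(episodes):
        state = start
        traces = [0.0] * len(weights)
        while state != 0:
            nxt = lookahead_step(weights, state)
            # Assumed reward: -1 for every repair step taken.
            td_error = -1.0 + gamma * value(weights, nxt) - value(weights, state)
            traces = [gamma * lam * e + x
                      for e, x in zip(traces, features(state))]
            weights = [w + alpha * td_error * e
                       for w, e in zip(weights, traces)]
            state = nxt
    return weights


def repair(weights: List[float], start: int) -> List[int]:
    """Greedy repair sequence under the learned evaluation function."""
    path = [start]
    while path[-1] != 0:
        path.append(lookahead_step(weights, path[-1]))
    return path
```

Calling `repair(train(), 9)` mirrors, in miniature, the train-small and test-larger setup: the weight is learned on episodes that start with 6 violations and then reused on a start state with 9. In this toy the true value of a state with n violations is -n/2, so the single weight should settle near -5.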

Citations

885 Learning to Predict by the Methods of Temporal Differences – Sutton - 1988
287 Practical issues in temporal difference learning – Tesauro - 1992
170 Generalization in reinforcement learning: Safely approximating the value function – Boyan, Moore - 1995
92 Knowledge-Based Training of Artificial Neural Networks for Autonomous Robot Driving – Pomerleau - 1993
87 Scheduling and Rescheduling with Iterative Repairs – Zweben, Davis, et al. - 1994
48 Reinforcement learning applied to linear quadratic regulation – Bradtke
8 The space shuttle ground processing scheduling system – Deale, Yvanovich, et al. - 1994
2 Using TD(λ) to learn an evaluation function for the game of Go – Schraudolph, Dayan, et al. - 1994
2 Issues in using approximation for reinforcement learning – Thrun, Schwartz - 1993
1 Generalization in reinforcement learning: Safely approximating the value function – Boyan, Moore - 1995
1 Reinforcement learning applied to linear quadratic regulation – Bradtke - 1993
1 Efficient training of artificial neural networks for autonomous navigation – Pomerleau - 1991
1 Using TD(λ) to learn an evaluation function for the game of Go – Schraudolph, Dayan, Sejnowski - 1994