Alexandria Engineering Journal (Jun 2021)

Solving flow-shop scheduling problem with a reinforcement learning algorithm that generalizes the value function with neural network

  • Jianfeng Ren,
  • Chunming Ye,
  • Feng Yang

Journal volume & issue
Vol. 60, no. 3
pp. 2787–2800

Abstract


This paper solves the flow-shop scheduling problem (FSP) through reinforcement learning (RL), approximating the value function with a neural network (NN). Under the RL framework, the state, strategy, action, reward signal, and value function of the FSP are described in detail. Considering the intrinsic features of the FSP, various features of the problem were mapped into RL states, including the maximum, minimum, and mean of the makespan, the maximum, minimum, and mean of the remaining operations, and the machine loads. In addition, the optimal scheduling rules corresponding to specific states were mapped into RL actions. On this basis, the NN was trained to establish the mapping between states and actions and to select the action with the highest probability in a given state. A reward function was constructed based on the idle time (IT) of the machines, and the value function was generalized by the NN. Finally, the algorithm was tested on 23 benchmark examples and more than 7 sets of machine examples. Small relative errors were achieved on 20 of the 23 benchmark examples, and satisfactory results were obtained on all 7 machine sets. The results confirm the superiority and universality of the proposed algorithm and indicate that the FSP can be solved effectively by mapping it completely into the RL framework. These findings provide a reference for solving similar problems with RL algorithms based on value function approximation.
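To make the described mapping concrete, the following is a minimal sketch, not the authors' code, of how FSP state features, dispatching-rule actions, and an idle-time signal might be encoded. All names (flow_shop_makespan, state_features, DISPATCH_RULES, the feature set, the rules chosen, and the stand-in linear policy) are illustrative assumptions rather than details taken from the paper.

```python
# Illustrative sketch of the RL encoding described in the abstract.
# Assumptions: permutation flow shop, processing-time matrix proc of shape
# (n_jobs, n_machines); features and dispatching rules are examples only.
import numpy as np

def flow_shop_makespan(proc, order):
    """Completion-time recursion for a permutation flow shop.
    Returns the makespan and the total machine idle time."""
    n_machines = proc.shape[1]
    finish = np.zeros(n_machines)          # finish time of the last job on each machine
    idle = 0.0
    for j in order:
        for m in range(n_machines):
            start = max(finish[m], finish[m - 1] if m > 0 else 0.0)
            idle += start - finish[m]      # machine m waits before starting job j
            finish[m] = start + proc[j, m]
    return finish[-1], idle

def state_features(proc, unscheduled):
    """Map the partial schedule to statistics of the kind the abstract lists:
    max/min/mean of remaining work, remaining-operation count, machine load."""
    remaining = proc[unscheduled].sum(axis=1)      # remaining work per unscheduled job
    machine_load = proc[unscheduled].sum(axis=0)   # remaining load per machine
    return np.array([remaining.max(), remaining.min(), remaining.mean(),
                     len(unscheduled), machine_load.max(), machine_load.mean()])

# Actions = classic dispatching rules that choose the next job to append.
DISPATCH_RULES = {
    0: lambda proc, u: u[np.argmin(proc[u].sum(axis=1))],  # SPT on total work
    1: lambda proc, u: u[np.argmax(proc[u].sum(axis=1))],  # LPT on total work
    2: lambda proc, u: u[np.argmin(proc[u, 0])],           # shortest first-machine time
}

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    proc = rng.integers(1, 20, size=(6, 4)).astype(float)  # 6 jobs, 4 machines
    W = rng.normal(size=(len(DISPATCH_RULES), 6)) * 0.1    # stand-in for a trained NN
    order, unsched = [], list(range(6))
    while unsched:
        s = state_features(proc, unsched)
        logits = W @ s                                      # NN would map state -> rule scores
        probs = np.exp(logits - logits.max()); probs /= probs.sum()
        a = int(np.argmax(probs))                           # pick the highest-probability rule
        job = DISPATCH_RULES[a](proc, np.array(unsched))
        order.append(int(job)); unsched.remove(job)
    cmax, idle = flow_shop_makespan(proc, order)
    print("order", order, "makespan", cmax, "idle time", idle)
```

In a full method along the lines the abstract sketches, the random matrix W would be replaced by a trained network, and the negative idle-time increment returned by the scheduler would serve as the reward signal driving that training.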

Keywords