IEEE Access (Jan 2018)
Reinforcement Learning-Based and Parametric Production-Maintenance Control Policies for a Deteriorating Manufacturing System
Abstract
The model of a stochastic production/inventory system that is subject to deterioration failures is developed and examined in this paper. Customer interarrival times are assumed to be random and backorders are allowed. The system experiences a number of deterioration stages before it ultimately fails and is rendered inoperable. Repair and maintenance activities restore the system to its initial and previous deterioration state, respectively. The duration of both repair and maintenance is assumed to be stochastic. We address the problem of minimizing the expected sum of two conflicting objective functions: the average inventory level and the average number of backorders. The solution to this problem consists of finding the optimal tradeoff between maintaining a high service level and carrying as low inventory as possible. The primary goal of this research is to obtain optimal or near-optimal joint production/maintenance control policies, by means of a novel reinforcement learning-based approach. Furthermore, we examine parametric production and maintenance policies that are often used in practical situations, namely, Kanban, (s, S), threshold-type condition based maintenance and periodic maintenance. The proposed approach is compared with the parametric policies in an extensive series of simulation experiments and it is found to clearly outperform them in all cases. Based on the numerical results obtained by the experiments, the behavior of the parametric policies as well as the structure of the control policies derived by the Reinforcement Learning-based approach is investigated.
Keywords