FAILURE RATE REGRESSION MODEL BUILDING FROM AGGREGATED DATA USING KERNEL-BASED MACHINE LEARNING METHODS
Abstract
This paper considers the problem of building a regression model of equipment failure rate from datasets that contain the number of failures of recoverable systems together with measurements of the technological and operational factors affecting the reliability of a production system. The problem is important for choosing an optimal strategy for preventive maintenance and restoration of process equipment elements, which in turn significantly affects the efficiency of the production management system. From a practical point of view, the greatest interest lies in methods for building regression models that assess the impact on failure rate of the various technological and operational factors monitored during system operation. The usual approach to constructing a regression model involves preselecting the model structure as a parameterized functional relationship between the failure rate and the technological variables that affect it, followed by statistical estimation of the unknown model parameters, or training of the model, on datasets of measured covariates and failures. The main difficulty lies precisely in the choice of model structure: its complexity should correspond to the amount of data available for training, and in failure rate modeling this choice is greatly complicated by the lack of a priori information about how the failure rate depends on the affecting variables. In this work, the problem is solved using machine learning methods, namely kernel ridge regression, which makes it possible to approximate complex nonlinear dependences of equipment failure rate on technological factors effectively without preselecting the model structure. Preliminary aggregation of the data by a combination of factor analysis and cluster analysis can significantly simplify the model structure. The proposed technique is illustrated by solving a practical problem of building a failure rate model for semiconductor production equipment based on real data.
Keywords