Digital Chemical Engineering (Sep 2023)
A framework for enhancing industrial soft sensor learning models
Abstract
Refinery processes are highly complex, with nonlinear dynamics that result from varying feedstock characteristics and from changes in product prioritization. Along these processes, key properties of intermediate compounds must be monitored and controlled, since they directly affect the quality of the end products sold by these manufacturers. However, most of these properties can only be measured through time-consuming and expensive laboratory analysis, which cannot be performed at the high frequency required to monitor them properly. Developing soft sensors is therefore the most common way to obtain high-frequency estimates of these measurements, helping advanced control systems to establish the correct setpoints for temperatures, pressures, and other measured variables along the refining process, and thereby to control the quality of end products. Since labeled data are scarce, most academic research has focused on semi-supervised learning strategies for developing machine learning (ML) models as soft sensors. Our research takes a different direction. We aim to develop a framework that leverages the knowledge of domain experts and employs data augmentation techniques to build an enhanced, fully labeled dataset that can be fed to any supervised ML algorithm to generate a quality soft sensor. We applied our framework together with Automated ML to train a model that predicts a specific key property associated with the production of Naphtha compounds in a refinery: the ASTM 95% distillation temperature of the Heavy Naphtha. Although our framework is model agnostic, we opted to use Automated ML as the optimization strategy, since it applies a diverse set of models to the dataset, reducing the bias of relying on a single optimization algorithm.
We evaluated the proposed framework in a case study carried out at an industrial refinery in Brazil, where the previous production model for estimating the ASTM 95% distillation temperature of the Heavy Naphtha was based entirely on physicochemical knowledge of the process. By adopting our framework with Automated ML, we improved the R² score by 120% over that model. The resulting ML model is currently operating in real time inside the refinery, leading to significant economic gains.
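To make the reported metric concrete, the following is a minimal sketch of how an R² score and a relative improvement between two scores could be computed. It assumes that the 120% figure is a relative change with respect to a positive baseline score; the abstract does not state this explicitly. The function names `r2_score` and `relative_improvement` are illustrative and are not taken from the paper.

```python
def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    # Sum of squared residuals between measurements and estimates.
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    # Total sum of squares around the mean of the measurements.
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def relative_improvement(r2_old, r2_new):
    """Relative change of the new score with respect to the old one.

    Assumes r2_old > 0; a value of 1.2 corresponds to a 120% improvement.
    """
    return (r2_new - r2_old) / r2_old
```

Under this reading, a 120% improvement means the new score is 2.2 times the old one. Because R² cannot exceed 1, the baseline score would have been at most about 0.45.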