Case Studies in Thermal Engineering (Sep 2023)
Development of a novel machine learning approach to optimize important parameters for improving the solubility of an anti-cancer drug within green chemistry solvent
Abstract
Predicting the solubility of drug particles in solvents remains a major challenge across several fields. Because experimental measurements are difficult and time-consuming, advanced computational methods for predicting drug solubility are needed. This study used hybrid machine learning (ML) models with two inputs, Pressure (X1) and Temperature (X2), to analyze the available data and correlate the solubility of drug particles in a supercritical solvent. Random Forest (RF), Extra Trees (ET), and Gradient Boosting (GB) regression models were fitted to the available data. The RF, ET, and GB models achieved R-squared values of 0.857, 0.998, and 0.992, respectively. Their MAE values were 2.90E-06, 1.98E-06, and 1.10E-06, respectively, and their MAPE values were 3.15E-01, 2.27E-01, and 1.16E-01, respectively. The DT method was chosen as the best method and can be used to find the optimal values, summarized as the vector (X1 = 383, X2 = 333.15, Y = 6.004e-05).
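The model comparison described in the abstract can be illustrated with a minimal sketch, assuming scikit-learn. It fits RF, ET, and GB regressors on two inputs (pressure and temperature) and reports R-squared, MAE, and MAPE on a held-out split. The data below is a synthetic placeholder rather than the study's measurements, and the function names (`make_placeholder_data`, `evaluate_models`), the input ranges, the 75/25 split, and the default hyperparameters are all assumptions made for illustration; the abstract does not specify the tuning or validation procedure.

```python
import numpy as np
from sklearn.ensemble import (
    ExtraTreesRegressor,
    GradientBoostingRegressor,
    RandomForestRegressor,
)
from sklearn.metrics import (
    mean_absolute_error,
    mean_absolute_percentage_error,
    r2_score,
)
from sklearn.model_selection import train_test_split


def make_placeholder_data(n_samples=60, seed=0):
    """Build SYNTHETIC data; this is not the solubility data set of the study."""
    rng = np.random.default_rng(seed)
    pressure = rng.uniform(100.0, 400.0, n_samples)     # X1, hypothetical range
    temperature = rng.uniform(300.0, 340.0, n_samples)  # X2, hypothetical range
    # Arbitrary smooth, strictly positive target used only to exercise the code.
    solubility = 1e-6 + 1e-7 * pressure + 5e-8 * (temperature - 300.0)
    X = np.column_stack([pressure, temperature])
    return X, solubility


def evaluate_models(X, y, seed=0):
    """Fit RF, ET, and GB regressors and score each one on a held-out split."""
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.25, random_state=seed
    )
    # Default hyperparameters are an assumption, not the study's settings.
    models = {
        "RF": RandomForestRegressor(random_state=seed),
        "ET": ExtraTreesRegressor(random_state=seed),
        "GB": GradientBoostingRegressor(random_state=seed),
    }
    scores = {}
    for name, model in models.items():
        model.fit(X_train, y_train)
        pred = model.predict(X_test)
        scores[name] = {
            "R2": r2_score(y_test, pred),
            "MAE": mean_absolute_error(y_test, pred),
            "MAPE": mean_absolute_percentage_error(y_test, pred),
        }
    return scores


if __name__ == "__main__":
    X, y = make_placeholder_data()
    for name, metrics in evaluate_models(X, y).items():
        print(name, {key: f"{value:.3e}" for key, value in metrics.items()})
```

Note that scikit-learn's `mean_absolute_percentage_error` returns a fraction rather than a percentage, so a value of 0.1 corresponds to a 10% mean relative error.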