Scientific Reports (Aug 2025)
Computational machine learning estimation of digitoxin solubility in supercritical solvent at different temperatures utilizing ensemble methods
Abstract
The solubility of drugs in a supercritical solvent is a key parameter that can be estimated with appropriate computational tools. This work models, as a case study, the solubility of digitoxin in supercritical CO2, together with the solvent density, using ensemble methods. Temperature and pressure are the input parameters, while solvent density and digitoxin solubility are the output parameters. Several machine learning models, coupled with an optimizer, were used to correlate the dataset. AdaBoost is employed as the ensemble method, with Bayesian Ridge Regression (BRR), Gaussian process regression (GPR), and K-nearest neighbors (KNN) as the base models. The Sailfish Optimizer (SFO) is used for hyper-parameter tuning to enhance model performance. The results show that the AdaBoost-boosted GPR model (ADA-GPR) gives the lowest Average Absolute Relative Deviation (AARD%) values: 7.74 for solubility and 2.76 for solvent density. This underscores the efficacy of ensemble methods and hyper-parameter tuning in accurately predicting complex chemical properties in supercritical CO2 systems.
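For readers unfamiliar with the error metric quoted above, the following is a minimal sketch of how AARD% is commonly computed, assuming the usual definition: the mean of |predicted - experimental| / |experimental| over all data points, multiplied by 100. The function name `aard_percent` and the sample values are hypothetical and chosen for illustration only; they are not taken from the paper's dataset or code.

```python
def aard_percent(experimental, predicted):
    """Average Absolute Relative Deviation, in percent.

    Assumes the common definition:
        AARD% = (100 / N) * sum(|pred_i - exp_i| / |exp_i|)
    Experimental values must be non-zero, otherwise the relative
    deviation is undefined.
    """
    if len(experimental) != len(predicted) or not experimental:
        raise ValueError("inputs must be non-empty and of equal length")
    total = 0.0
    for y_exp, y_pred in zip(experimental, predicted):
        total += abs(y_pred - y_exp) / abs(y_exp)
    return 100.0 * total / len(experimental)


# Illustrative values only; these are NOT data from the paper.
measured = [1.0, 2.0]
estimated = [1.1, 1.8]
score = aard_percent(measured, estimated)
print(f"AARD% = {score:.2f}")
```

Because each deviation is scaled by the measured value, AARD% lets errors on outputs of very different magnitude, such as solubility and solvent density, be compared on the same percentage scale.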
Keywords