Environment International (Dec 2024)
A comprehensive machine learning-based models for predicting mixture toxicity of azole fungicides toward algae (Auxenochlorella pyrenoidosa)
Abstract
Quantitative structure–activity relationships (QSARs) have been used to predict mixture toxicity. However, accurate prediction of the mixture toxicity of azole fungicides remains a gap in current research, and machine learning (ML) algorithms have emerged as an effective strategy for addressing it. In this study, we applied 12 algorithms, namely k-nearest neighbor (KNN), kernel k-nearest neighbors (KKNN), support vector machine (SVM), random forest (RF), stochastic gradient boosting (GBM), cubist, bagged multivariate adaptive regression splines (Bagged MARS), eXtreme gradient boosting (XGBoost), boosted generalized linear model (GLMBoost), boosted generalized additive model (GAMBoost), Bayesian regularized neural networks (BRNN), and recursive partitioning and regression trees (CART), to build ML models from 225 mixture toxicity values of azole fungicides toward Auxenochlorella pyrenoidosa. A total of 36 single ML models and 12 consensus models were developed. Models that used concentration addition (CA), independent action (IA), and molecular descriptors (MD) as variables showed superior predictive ability. The consensus model combining the SVM and RF algorithms (labeled CM0) fitted the data most accurately, with a coefficient of determination of 0.980. It also predicted well on external data, with an external R² of 0.945 and a concordance correlation coefficient (CCC) of 0.967. This study contributes to the ecological risk assessment of mixtures of azole fungicides.
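The two evaluation statistics quoted above, and the idea of a two-model consensus, can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not say how the SVM and RF predictions are combined, so the unweighted average in `consensus` is an assumption, and all function names and numbers are hypothetical. The CCC follows Lin's standard definition using population (1/n) moments.

```python
from statistics import fmean


def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_obs = fmean(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


def concordance_cc(observed, predicted):
    """Lin's concordance correlation coefficient (population moments)."""
    n = len(observed)
    mean_obs, mean_pred = fmean(observed), fmean(predicted)
    var_obs = sum((o - mean_obs) ** 2 for o in observed) / n
    var_pred = sum((p - mean_pred) ** 2 for p in predicted) / n
    cov = sum(
        (o - mean_obs) * (p - mean_pred) for o, p in zip(observed, predicted)
    ) / n
    return 2.0 * cov / (var_obs + var_pred + (mean_obs - mean_pred) ** 2)


def consensus(pred_a, pred_b):
    """ASSUMPTION: consensus = unweighted mean of two models' predictions."""
    return [(a + b) / 2.0 for a, b in zip(pred_a, pred_b)]


if __name__ == "__main__":
    # Made-up toxicity values and predictions, for illustration only.
    observed = [4.0, 4.5, 5.0, 5.5, 6.0]
    svm_pred = [4.2, 4.4, 5.1, 5.3, 6.1]  # hypothetical SVM output
    rf_pred = [3.9, 4.7, 4.9, 5.6, 5.8]   # hypothetical RF output

    cm_pred = consensus(svm_pred, rf_pred)
    print("R2 :", r_squared(observed, cm_pred))
    print("CCC:", concordance_cc(observed, cm_pred))
```

Unlike R², the CCC penalizes a systematic offset between predicted and observed values as well as scatter, which is why the two statistics are usually reported together for external validation.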