Environment International (Jun 2024)

Building species trait-specific nano-QSARs: Model stacking, navigating model uncertainties and limitations, and the effect of dataset size

  • Surendra Balraadjsing,
  • Willie J.G.M. Peijnenburg,
  • Martina G. Vijver

Journal volume & issue
Vol. 188
p. 108764

Abstract

A strong need exists for broadly applicable nano-QSARs capable of predicting toxicological outcomes for untested species and nanomaterials under different environmental conditions. Existing nano-QSARs are generally limited to only a few species, but including species characteristics in the models can make them applicable to multiple species, even those for which no toxicity data are available. Species traits were used to create classification and regression machine learning models that predict the acute toxicity of metallic nanomaterials towards aquatic species. The individual classification and regression models were then stacked into a meta-model to improve performance. Additionally, the uncertainty and limitations of the models were assessed in detail (beyond the OECD principles), and it was investigated whether the models would benefit from the addition of more data. Results showed a significant improvement in model performance following model stacking. Investigation of model uncertainties and limitations highlighted a discrepancy between the applicability domain and the accuracy of predictions: data points outside the assessed chemical space were not more likely to yield inadequate predictions, nor were data points inside it more likely to yield adequate ones. It is therefore concluded that the applicability domain does not give complete insight into the uncertainty of predictions, and that generating prediction intervals can help in this regard. Furthermore, results indicated that increasing the dataset size did not improve model performance. This implies that larger datasets do not necessarily improve model performance and, in turn, that large datasets are not necessarily required to predict acute toxicity with nano-QSARs.
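To make the idea of a prediction interval concrete, the following is a minimal sketch and not the authors' method: the abstract does not state how intervals were generated, so split-conformal prediction is assumed here purely for illustration. The data are synthetic, the single descriptor and the log-scale endpoint are hypothetical, and the helper names `fit_line`, `conformal_halfwidth`, and `predict_interval` are invented for this example.

```python
import math
import random
import statistics


def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x with a single descriptor."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx
    return my - b * mx, b


def conformal_halfwidth(residuals, alpha=0.1):
    """Split-conformal quantile of the absolute calibration residuals."""
    n = len(residuals)
    k = math.ceil((n + 1) * (1 - alpha))  # rank of the conformal quantile
    if k > n:
        return float("inf")  # too few calibration points for this alpha
    return sorted(residuals)[k - 1]


# Synthetic stand-in data (assumption): one hypothetical descriptor x and a
# hypothetical log-scale toxicity endpoint y with Gaussian noise.
rng = random.Random(0)
xs = [rng.uniform(0, 10) for _ in range(200)]
ys = [1.5 + 0.4 * x + rng.gauss(0, 0.5) for x in xs]

# Split into a training half (fits the model) and a calibration half
# (measures how wrong the fitted model tends to be on unseen points).
train_x, train_y = xs[:100], ys[:100]
cal_x, cal_y = xs[100:], ys[100:]

a, b = fit_line(train_x, train_y)
residuals = [abs(y - (a + b * x)) for x, y in zip(cal_x, cal_y)]
q = conformal_halfwidth(residuals, alpha=0.1)


def predict_interval(x):
    """Point prediction widened by the calibrated half-width q."""
    centre = a + b * x
    return centre - q, centre + q


lo, hi = predict_interval(5.0)
print(f"prediction interval at x=5.0: [{lo:.2f}, {hi:.2f}]")
```

In this sketch the interval width comes from errors observed on held-out data rather than from a point's distance to the training set, which is one way an interval can carry information that an applicability-domain check does not.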

Keywords