Artificial Intelligence in the Life Sciences (Dec 2022)

Modeling bioconcentration factors in fish with explainable deep learning

  • Linlin Zhao,
  • Floriane Montanari,
  • Henry Heberle,
  • Sebastian Schmidt

Journal volume & issue
Vol. 2
p. 100047

Abstract

The Bioconcentration Factor (BCF) is an important parameter in the environmental risk assessment of chemicals; it is relevant for industrial and academic research and is required in many regulatory contexts. It represents the potential of a substance to accumulate in organic tissues or whole animals and is most frequently measured in fish. However, animal welfare concerns, throughput limitations, and costs drive the need for alternative methods that allow accurate and reliable in silico estimation of BCF. We present a new deep learning model that predicts BCF values from chemical structures and outperforms currently available models (R² of 0.68 and RMSE of 0.59 log units on an external test set; R² of 0.70 and RMSE of 0.74 log units in a demanding cluster-split validation). The model is based on molecular representations encoded as CDDD descriptors and exploits a large in-house dataset with measured logD values as an auxiliary task. Additionally, we developed a post-hoc explainability method based on SMILES character substitutions to accompany our predictions with atom-level interpretations. These sensitivity scores highlight the most influential moieties in the molecule and can help users understand the predictions better and design new molecules.
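
The idea of scoring atoms by substituting SMILES characters can be illustrated with a minimal sketch. The abstract does not specify the substitution alphabet, how the prediction changes are aggregated, or how invalid SMILES are handled, so everything below is an assumption made for illustration: the alphabet `ATOMS`, the mean absolute change as the score, and the function names `toy_predict` and `sensitivity_scores` are hypothetical. In particular, `toy_predict` is a stand-in for a trained model and is not a BCF predictor.

```python
from typing import Callable, List

# Assumed substitution alphabet: single-letter aliphatic atoms only.
ATOMS = "CNO"


def toy_predict(smiles: str) -> float:
    """Hypothetical stand-in for a trained model (NOT a real BCF predictor)."""
    return (
        0.5 * smiles.count("C")
        - 1.0 * smiles.count("O")
        - 0.5 * smiles.count("N")
    )


def sensitivity_scores(
    smiles: str, predict: Callable[[str], float] = toy_predict
) -> List[float]:
    """Return one score per SMILES character.

    Each atom character in ATOMS is swapped for every other character in
    ATOMS; its score is the mean absolute change in the prediction.
    Characters that are not substituted (bonds, ring digits, brackets,
    two-letter atoms such as Cl) receive a score of 0.0.
    """
    base = predict(smiles)
    scores = []
    for i, ch in enumerate(smiles):
        # Skip the "C" of chlorine so that "Cl" is never split.
        is_chlorine = ch == "C" and smiles[i + 1 : i + 2] == "l"
        if ch not in ATOMS or is_chlorine:
            scores.append(0.0)
            continue
        deltas = [
            abs(predict(smiles[:i] + alt + smiles[i + 1 :]) - base)
            for alt in ATOMS
            if alt != ch
        ]
        scores.append(sum(deltas) / len(deltas))
    return scores


# Ethanol: one score for each of the three characters.
print(sensitivity_scores("CCO"))  # [1.25, 1.25, 1.0]
```

A practical version would additionally check that each substituted string is still a valid SMILES (for example with a cheminformatics toolkit) before passing it to the model; that step is omitted here to keep the sketch self-contained.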

Keywords