On the robustness of generalization of drug–drug interaction models

Rogia Kpanou; Mazid Abiodoun Osseni; Prudencio Tossou; Francois Laviolette; Jacques Corbeil

doi:10.1186/s12859-021-04398-9

BMC Bioinformatics (Oct 2021)

Affiliations

Rogia Kpanou: Computer Science and Software Engineering, Université Laval
Mazid Abiodoun Osseni: Computer Science and Software Engineering, Université Laval
Prudencio Tossou: Computer Science and Software Engineering, Université Laval
Francois Laviolette: Computer Science and Software Engineering, Université Laval
Jacques Corbeil: Department of Molecular Medicine, Université Laval

DOI: https://doi.org/10.1186/s12859-021-04398-9
Journal volume & issue: Vol. 22, no. 1, pp. 1–21

Abstract

Background: Deep learning methods are a proven commodity in many fields and endeavors. One of these endeavors is predicting the presence of adverse drug–drug interactions (DDIs). The resulting models can predict, with reasonable accuracy, the phenotypes arising from drug interactions from the molecular structures of the drugs involved. Nevertheless, the task requires further improvement before these models are truly useful. Given the complexity of the predictive task, we performed an extensive benchmark of structure-based models for DDI prediction to evaluate their drawbacks and advantages.

Results: We rigorously tested various structure-based models that predict drug interactions, using different splitting strategies to simulate different real-world scenarios. In addition to the effects of different training and testing setups on the robustness and generalizability of the models, we explored the contribution of traditional approaches such as multitask learning and data augmentation.

Conclusion: Structure-based models tend to generalize poorly to unseen drugs, despite their ability to accurately identify new DDIs among drugs seen during training. Indeed, they efficiently propagate information between known drugs and could be valuable for discovering new DDIs in a database. However, these models will most probably fail when exposed to unknown drugs. While multitask learning does not solve the problem in our case, data augmentation does at least mitigate it. Therefore, researchers must be cautious of the bias of the random evaluation scheme, especially if their goal is to discover new DDIs.
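
The difference between a random evaluation scheme and an evaluation on unseen drugs can be illustrated with a small sketch. The abstract does not specify the exact splitting protocol, so everything below is an assumption made for illustration: the function names, the toy drug identifiers, and the choice to discard pairs that mix a held-out drug with a training drug are all hypothetical and are not taken from the paper.

```python
import random
from itertools import combinations


def random_split(pairs, test_frac=0.2, seed=0):
    """Split interaction pairs at random; test drugs may also appear in training."""
    rng = random.Random(seed)
    shuffled = list(pairs)
    rng.shuffle(shuffled)
    n_test = int(len(shuffled) * test_frac)
    return shuffled[n_test:], shuffled[:n_test]


def unseen_drug_split(pairs, test_frac=0.2, seed=0):
    """Hold out a set of drugs so that test pairs contain only held-out drugs.

    Pairs mixing a held-out drug with a training drug are discarded here.
    This is one possible convention (an assumption), not necessarily the paper's.
    """
    rng = random.Random(seed)
    drugs = sorted({d for pair in pairs for d in pair})
    rng.shuffle(drugs)
    n_held = max(1, int(len(drugs) * test_frac))
    held_out = set(drugs[:n_held])
    train = [p for p in pairs if p[0] not in held_out and p[1] not in held_out]
    test = [p for p in pairs if p[0] in held_out and p[1] in held_out]
    return train, test


def drugs_in(pairs):
    """Return the set of drugs that occur in a list of pairs."""
    return {d for pair in pairs for d in pair}


# Toy data: hypothetical drug identifiers; every pair is treated as one example.
toy_drugs = [f"drug_{i}" for i in range(10)]
toy_pairs = list(combinations(toy_drugs, 2))

rand_train, rand_test = random_split(toy_pairs)
cold_train, cold_test = unseen_drug_split(toy_pairs, test_frac=0.3)

# Drugs shared between training and test under each scheme.
rand_overlap = drugs_in(rand_train) & drugs_in(rand_test)
cold_overlap = drugs_in(cold_train) & drugs_in(cold_test)
```

Under the random split, a drug in a test pair can also occur in training pairs, so a model may score well by propagating information between known drugs. Under the drug-level split, `cold_overlap` is empty by construction, which corresponds to the unseen-drug scenario in which the abstract reports poor generalization.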

Published in BMC Bioinformatics

ISSN: 1471-2105 (Online)
Publisher: BMC
Country of publisher: United Kingdom
LCC subjects: Medicine: Medicine (General): Computer applications to medicine. Medical informatics; Science: Biology (General)
Website: http://www.biomedcentral.com/bmcbioinformatics/
