BMC Bioinformatics (Aug 2023)

EDST: a decision stump based ensemble algorithm for synergistic drug combination prediction

  • Jing Chen,
  • Lianlian Wu,
  • Kunhong Liu,
  • Yong Xu,
  • Song He,
  • Xiaochen Bo

DOI
https://doi.org/10.1186/s12859-023-05453-3
Journal volume & issue
Vol. 24, no. 1
pp. 1 – 21

Abstract

Read online

Abstract Introduction There are countless possibilities for drug combinations, which makes it expensive and time-consuming to rely solely on clinical trials to determine the effects of each possible drug combination. In order to screen out the most effective drug combinations more quickly, scholars began to apply machine learning to drug combination prediction. However, most of them are of low interpretability. Consequently, even though they can sometimes produce high prediction accuracy, experts in the medical and biological fields can still not fully rely on their judgments because of the lack of knowledge about the decision-making process. Related work Decision trees and their ensemble algorithms are considered to be suitable methods for pharmaceutical applications due to their excellent performance and good interpretability. We review existing decision trees or decision tree ensemble algorithms in the medical field and point out their shortcomings. Method This study proposes a decision stump (DS)-based solution to extract interpretable knowledge from data sets. In this method, a set of DSs is first generated to selectively form a decision tree (DST). Different from the traditional decision tree, our algorithm not only enables a partial exchange of information between base classifiers by introducing a stump exchange method but also uses a modified Gini index to evaluate stump performance so that the generation of each node is evaluated by a global view to maintain high generalization ability. Furthermore, these trees are combined to construct an ensemble of DST (EDST). Experiment The two-drug combination data sets are collected from two cell lines with three classes (additive, antagonistic and synergistic effects) to test our method. Experimental results show that both our DST and EDST perform better than other methods. Besides, the rules generated by our methods are more compact and more accurate than other rule-based algorithms. Finally, we also analyze the extracted knowledge by the model in the field of bioinformatics. Conclusion The novel decision tree ensemble model can effectively predict the effect of drug combination datasets and easily obtain the decision-making process.

Keywords