BMC Bioinformatics (Oct 2017)
Predicting combinative drug pairs towards realistic screening via integrating heterogeneous features
Abstract
Abstract Background Drug Combination is one of the effective approaches for treating complex diseases. However, determining combinative drug pairs in clinical trials is still costly. Thus, computational approaches are used to identify potential drug pairs in advance. Existing computational approaches have the following shortcomings: (i) the lack of an effective integration of heterogeneous features leads to a time-consuming training and even results in an over-fitted classifier; and (ii) the narrow consideration of predicting potential drug combinations only among known drugs having known combinations cannot meet the demand of realistic screenings, which pay more attention to potential combinative pairs among newly-coming drugs that have no approved combination with other drugs at all. Results In this paper, to tackle the above two problems, we propose a novel drug-driven approach for predicting potential combinative pairs on a large scale. We define four new features based on heterogeneous data and design an efficient fusion scheme to integrate these feature. Moreover importantly, we elaborate appropriate cross-validations towards realistic screening scenarios of drug combinations involving both known drugs and new drugs. In addition, we perform an extra investigation to show how each kind of heterogeneous features is related to combinative drug pairs. The investigation inspires the design of our approach. Experiments on real data demonstrate the effectiveness of our fusion scheme for integrating heterogeneous features and its predicting power in three scenarios of realistic screening. In terms of both AUC and AUPR, the prediction among known drugs achieves 0.954 and 0.821, that between known drugs and new drugs achieves 0.909 and 0.635, and that among new drugs achieves 0.809 and 0.592 respectively. Conclusions Our approach provides not only an effective tool to integrate heterogeneous features but also the first tool to predict potential combinative pairs among new drugs.