Machine Learning and Knowledge Extraction (Jul 2024)

Insights from Augmented Data Integration and Strong Regularization in Drug Synergy Prediction with SynerGNet

  • Mengmeng Liu
  • Gopal Srivastava
  • J. Ramanujam
  • Michal Brylinski

DOI
https://doi.org/10.3390/make6030087
Journal volume & issue
Vol. 6, no. 3
pp. 1782 – 1797

Abstract

SynerGNet is a novel approach to predicting drug synergy against cancer cell lines. In this study, we discuss in detail the construction process of SynerGNet, emphasizing its comprehensive design tailored to handle complex data patterns. Additionally, we investigate a counterintuitive phenomenon in which integrating more augmented data into the training set increases the testing loss while also improving predictive accuracy. This sheds light on the nuanced dynamics of model learning. Further, we demonstrate the effectiveness of strong regularization techniques in mitigating overfitting, ensuring the robustness and generalization ability of SynerGNet. Finally, we highlight the continuous performance enhancements achieved through the integration of augmented data. By gradually increasing the amount of augmented data in the training set, we observe substantial improvements in model performance. For instance, compared to models trained exclusively on the original data, the integration of the augmented data can lead to a 5.5% increase in balanced accuracy and a 7.8% decrease in the false positive rate. Through rigorous benchmarks and analyses, our study contributes valuable insights into the development and optimization of predictive models in biomedical research.
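
As an illustration of how testing loss and predictive accuracy can move in opposite directions, the following minimal Python sketch compares two hypothetical sets of predicted synergy probabilities on a toy test set. It is not SynerGNet code and uses no data from the study; the labels, the probabilities, and all function names are assumptions made for this example. It shows one common mechanism only: a model that is right more often, but very confident on the few cases it gets wrong, can have a higher cross-entropy loss together with a higher balanced accuracy and a lower false positive rate.

```python
import math


def binary_cross_entropy(labels, probs):
    """Mean binary cross-entropy between 0/1 labels and predicted probabilities."""
    eps = 1e-12  # clip probabilities to avoid log(0)
    total = 0.0
    for y, p in zip(labels, probs):
        p = min(max(p, eps), 1.0 - eps)
        total += -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / len(labels)


def confusion_counts(labels, probs, threshold=0.5):
    """Return (tp, fp, tn, fn) after thresholding the probabilities."""
    tp = fp = tn = fn = 0
    for y, p in zip(labels, probs):
        pred = 1 if p >= threshold else 0
        if pred == 1 and y == 1:
            tp += 1
        elif pred == 1 and y == 0:
            fp += 1
        elif pred == 0 and y == 0:
            tn += 1
        else:
            fn += 1
    return tp, fp, tn, fn


def balanced_accuracy(labels, probs, threshold=0.5):
    """Average of the true positive rate and the true negative rate."""
    tp, fp, tn, fn = confusion_counts(labels, probs, threshold)
    return 0.5 * (tp / (tp + fn) + tn / (tn + fp))


def false_positive_rate(labels, probs, threshold=0.5):
    """Fraction of true negatives that are predicted positive."""
    tp, fp, tn, fn = confusion_counts(labels, probs, threshold)
    return fp / (fp + tn)


# Toy test set (hypothetical): 1 = synergistic, 0 = not synergistic.
labels = [1, 1, 1, 0, 0, 0]

# Hypothetical model A: hesitant predictions, two mistakes near the threshold.
probs_a = [0.6, 0.6, 0.4, 0.4, 0.4, 0.6]

# Hypothetical model B: confident predictions, one mistake made with
# very high confidence, which dominates the mean loss.
probs_b = [0.9, 0.9, 0.001, 0.1, 0.1, 0.1]

for name, probs in (("A", probs_a), ("B", probs_b)):
    print(
        f"model {name}: "
        f"loss={binary_cross_entropy(labels, probs):.3f}, "
        f"balanced_accuracy={balanced_accuracy(labels, probs):.3f}, "
        f"fpr={false_positive_rate(labels, probs):.3f}"
    )
```

In this toy case, model B has the higher loss even though it classifies more samples correctly, because cross-entropy penalizes a single confident error heavily while threshold-based metrics count it only once. Whether this mechanism accounts for the behavior reported in the study is not something this sketch can establish.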

Keywords