Biology Direct (Apr 2019)

Robust pathway-based multi-omics data integration using directed random walks for survival prediction in multiple cancer studies

  • So Yeon Kim,
  • Hyun-Hwan Jeong,
  • Jaesik Kim,
  • Jeong-Hyeon Moon,
  • Kyung-Ah Sohn

DOI
https://doi.org/10.1186/s13062-019-0239-8
Journal volume & issue
Vol. 14, no. 1
pp. 1 – 13

Abstract

Read online

Abstract Background Integrating the rich information from multi-omics data has been a popular approach to survival prediction and bio-marker identification for several cancer studies. To facilitate the integrative analysis of multiple genomic profiles, several studies have suggested utilizing pathway information rather than using individual genomic profiles. Methods We have recently proposed an integrative directed random walk-based method utilizing pathway information (iDRW) for more robust and effective genomic feature extraction. In this study, we applied iDRW to multiple genomic profiles for two different cancers, and designed a directed gene-gene graph which reflects the interaction between gene expression and copy number data. In the experiments, the performances of the iDRW method and four state-of-the-art pathway-based methods were compared using a survival prediction model which classifies samples into two survival groups. Results The results show that the integrative analysis guided by pathway information not only improves prediction performance, but also provides better biological insights into the top pathways and genes prioritized by the model in both the neuroblastoma and the breast cancer datasets. The pathways and genes selected by the iDRW method were shown to be related to the corresponding cancers. Conclusions In this study, we demonstrated the effectiveness of a directed random walk-based multi-omics data integration method applied to gene expression and copy number data for both breast cancer and neuroblastoma datasets. We revamped a directed gene-gene graph considering the impact of copy number variation on gene expression and redefined the weight initialization and gene-scoring method. The benchmark result for iDRW with four pathway-based methods demonstrated that the iDRW method improved survival prediction performance and jointly identified cancer-related pathways and genes for two different cancer datasets. Reviewers This article was reviewed by Helena Molina-Abril and Marta Hidalgo.

Keywords