BMC Medical Genomics (Mar 2017)

Revealing cancer subtypes with higher-order correlations applied to imaging and omics data

  • Kiley Graim,
  • Tiffany Ting Liu,
  • Achal S. Achrol,
  • Evan O. Paull,
  • Yulia Newton,
  • Steven D. Chang,
  • Griffith R. Harsh,
  • Sergio P. Cordero,
  • Daniel L. Rubin,
  • Joshua M. Stuart

DOI
https://doi.org/10.1186/s12920-017-0256-3
Journal volume & issue
Vol. 10, no. 1
pp. 1 – 14

Abstract

Background: Patient stratification to identify subtypes with different disease manifestations, severity, and expected survival time is a critical task in cancer diagnosis and treatment. While stratification approaches using various biomarkers (including high-throughput gene expression measurements) for patient-to-patient comparisons have been successful in elucidating previously unseen subtypes, there remains an untapped potential of incorporating various genotypic and phenotypic data to discover novel or improved groupings.

Methods: Here, we present HOCUS, a unified analytical framework for patient stratification that uses a community detection technique to extract subtypes out of sparse patient measurements. HOCUS constructs a patient-to-patient network from similarities in the data and iteratively groups and reconstructs the network into higher order clusters. We investigate the merits of using higher-order correlations to cluster samples of cancer patients in terms of their associations with survival outcomes.

Results: In an initial test of the method, the approach identifies cancer subtypes in mutation data of glioblastoma, ovarian, breast, prostate, and bladder cancers. In several cases, HOCUS provides an improvement over using the molecular features directly to compare samples. Application of HOCUS to glioblastoma images reveals a size and location classification of tumors that improves over human expert-based stratification.

Conclusions: Subtypes based on higher order features can reveal comparable or distinct groupings. The distinct solutions can provide biologically- and treatment-relevant solutions that are just as significant as solutions based on the original data.
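
The abstract does not say how the higher-order correlations or the community detection step are computed, so the following is only a minimal sketch of the general idea, not the HOCUS implementation. It assumes Pearson correlation as the patient-to-patient similarity, treats a "higher-order" similarity as the correlation between rows of the previous similarity matrix, and swaps in connected components of a thresholded graph as a simple stand-in for community detection. The function names, the threshold, and the toy mutation matrix are all hypothetical.

```python
import numpy as np


def higher_order_similarity(features, order=2):
    """Patient-by-patient Pearson correlation, re-correlated (order - 1) times.

    Assumption: rows are patients, columns are features (e.g. mutated genes).
    order=1 is the ordinary correlation between patient feature profiles.
    """
    sim = np.corrcoef(features)
    for _ in range(order - 1):
        # Compare patients by their similarity profiles, not raw features.
        sim = np.corrcoef(sim)
    return sim


def threshold_communities(sim, threshold=0.5):
    """Stand-in for community detection: connected components of the graph
    that links two patients whenever their similarity exceeds `threshold`."""
    n = sim.shape[0]
    labels = [-1] * n
    current = 0
    for start in range(n):
        if labels[start] != -1:
            continue
        labels[start] = current
        stack = [start]
        while stack:
            node = stack.pop()
            for other in range(n):
                if labels[other] == -1 and sim[node, other] > threshold:
                    labels[other] = current
                    stack.append(other)
        current += 1
    return labels


# Toy binary mutation matrix (made-up data): 6 patients x 6 genes.
toy = np.array([
    [1, 1, 1, 0, 0, 0],
    [1, 1, 0, 0, 0, 0],
    [1, 0, 1, 0, 0, 0],
    [0, 0, 0, 1, 1, 1],
    [0, 0, 0, 1, 1, 0],
    [0, 0, 0, 0, 1, 1],
], dtype=float)

first = higher_order_similarity(toy, order=1)
second = higher_order_similarity(toy, order=2)
labels = threshold_communities(second)
```

In this toy case, patients 2 and 3 share only one mutated gene, so their first-order correlation is weak, but both resemble patient 1 and differ from patients 4 to 6 in the same way, so their second-order correlation is considerably higher. That is the kind of effect sparse measurements might benefit from; whether it holds on real data depends on the details the full paper describes.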

Keywords