IEEE Access (Jan 2020)

ELMO: An Efficient Logistic Regression-Based Multi-Omic Integrated Analysis Method for Breast Cancer Intrinsic Subtypes

  • Yexian Zhang,
  • Ruoyao Shi,
  • Chaorong Chen,
  • Meiyu Duan,
  • Shuai Liu,
  • Yanjiao Ren,
  • Lan Huang,
  • Xiaofeng Dai,
  • Fengfeng Zhou

DOI
https://doi.org/10.1109/ACCESS.2019.2960373
Journal volume & issue
Vol. 8
pp. 5121 – 5130

Abstract

Breast cancer is one of the most frequently occurring cancers in women and a major cause of death among women worldwide. It is heterogeneous in both molecular characteristics and clinical outcomes across its molecular subtypes. High-throughput technologies have enabled the rapid accumulation of multi-Omic data for cancer patients, and these data sources pose a computational challenge for efficient integrated multi-Omic analysis. Existing studies usually investigate differential representation or machine learning problems using a single type of Omic data. This study hypothesized that different Omic types contribute complementary information, and that their integrated analysis may improve on single-Omic models. An efficient logistic regression-based multi-Omic integrated analysis method (ELMO) was proposed to integrate RNA-seq and DNA methylation data to detect breast cancer intrinsic subtypes. In this study, ELMO achieved the highest accuracy with fewer features than the existing filter and wrapper feature selection methods it was compared against. The experimental data supported the hypothesis that multi-Omic models outperform single-Omic ones.
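
The hypothesis that two Omic types carry complementary information can be illustrated with a toy sketch. This is not the ELMO algorithm, whose details are not given in the abstract. It is a generic early-integration example that concatenates two feature blocks and fits a plain logistic regression. All data are synthetic, the names `rna`, `meth`, `fit_logreg`, and `accuracy` are hypothetical, and the label is binary, whereas intrinsic subtype detection is a multi-class problem.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-ins for two omic blocks (NOT real data): rows are samples.
n_samples, n_rna, n_meth = 1000, 5, 5
rna = rng.normal(size=(n_samples, n_rna))    # hypothetical expression features
meth = rng.normal(size=(n_samples, n_meth))  # hypothetical methylation features

# Toy binary label that depends on one feature from each block,
# so neither block alone carries the full signal (an assumption of this sketch).
y = (rna[:, 0] + meth[:, 0] > 0).astype(float)


def fit_logreg(X, y, lr=0.5, n_iter=500):
    """Fit binary logistic regression by batch gradient descent."""
    Xb = np.hstack([np.ones((X.shape[0], 1)), X])  # prepend intercept column
    w = np.zeros(Xb.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-Xb @ w))   # predicted probabilities
        w -= lr * Xb.T @ (p - y) / len(y)   # step along the mean log-loss gradient
    return w


def accuracy(w, X, y):
    """Fraction of samples whose predicted class matches the label."""
    Xb = np.hstack([np.ones((X.shape[0], 1)), X])
    return float(np.mean((Xb @ w > 0) == (y == 1)))


# Simple hold-out split: first 700 samples for training, the rest for testing.
train, test = slice(0, 700), slice(700, None)
blocks = {
    "rna_only": rna,
    "meth_only": meth,
    "integrated": np.hstack([rna, meth]),  # early integration by concatenation
}
scores = {
    name: accuracy(fit_logreg(X[train], y[train]), X[test], y[test])
    for name, X in blocks.items()
}
print(scores)
```

Because the toy label is built from both blocks, the integrated model has access to signal that each single-block model lacks. On real data, whether integration helps depends on how much the Omic types actually complement each other, which is the question the study tests.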

Keywords