BMC Bioinformatics (Jan 2019)
Identifying genes with tri-modal association with survival and tumor grade in cancer patients
Abstract
Background: Previous cancer genomics studies have focused on searching for novel oncogenes and tumor suppressor genes whose abundance is positively or negatively correlated with an end-point observation, such as survival or tumor grade. This approach may miss truly functional genes when both the low and the high expression modes of a gene are associated with the end-point observation. Such genes would act as both oncogenes and tumor suppressor genes, a scenario that is unlikely but theoretically possible.

Results: We developed an Expectation-Maximization (EM) algorithm that divides patients into low-, middle- and high-expressing groups according to the expression level of a given gene in both tumor and normal patients. We found one gene, ORMDL3, whose low and high modes were both associated with worse survival and higher tumor grade in breast cancer patients across multiple patient cohorts. We speculate that its tumor suppressor role may be real, while the correlation of its high expression with worse end-point outcome is probably a passenger effect of the amplification of the nearby gene ERBB2.

Conclusions: The proposed EM algorithm can effectively detect genes with tri-modally distributed expression in patient groups compared with normal genes, offering a new perspective on dissecting the association between genomic features and end-point observations. Our analysis of breast cancer datasets suggests that the gene ORMDL3 may have an unexploited tumor suppressive function.
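The grouping step described in the Results can be illustrated with a minimal sketch. The abstract does not specify the form of the authors' EM algorithm, so the code below assumes a generic one-dimensional, three-component Gaussian mixture fitted by standard EM; the function name `fit_three_mode_em`, its parameters, and the percentile-based initialization are hypothetical choices made for illustration, not the published method.

```python
import numpy as np

def fit_three_mode_em(x, n_iter=200, tol=1e-8):
    """Fit a 3-component 1-D Gaussian mixture by EM (illustrative assumption,
    not the authors' algorithm) and label each sample 0=low, 1=middle, 2=high."""
    x = np.asarray(x, dtype=float)
    # Hypothetical initialization: means at the 10th, 50th and 90th percentiles.
    mu = np.percentile(x, [10, 50, 90])
    sigma = np.full(3, x.std() / 3 + 1e-6)
    pi = np.full(3, 1.0 / 3.0)
    prev_ll = -np.inf
    for _ in range(n_iter):
        # E-step: log of (weight * Gaussian density) per sample and component.
        z = (x[:, None] - mu) / sigma
        log_p = np.log(pi) - np.log(sigma) - 0.5 * np.log(2 * np.pi) - 0.5 * z**2
        log_norm = np.logaddexp.reduce(log_p, axis=1)
        resp = np.exp(log_p - log_norm[:, None])  # responsibilities, rows sum to 1
        # M-step: re-estimate weights, means and standard deviations.
        nk = resp.sum(axis=0) + 1e-12
        pi = nk / x.size
        mu = (resp * x[:, None]).sum(axis=0) / nk
        sigma = np.sqrt((resp * (x[:, None] - mu) ** 2).sum(axis=0) / nk) + 1e-6
        # Stop when the log-likelihood no longer changes appreciably.
        ll = log_norm.sum()
        if abs(ll - prev_ll) < tol:
            break
        prev_ll = ll
    # Re-order components by mean so that label 0 is the lowest-expressing mode.
    order = np.argsort(mu)
    rank = np.argsort(order)
    labels = rank[resp.argmax(axis=1)]
    return labels, mu[order], sigma[order], pi[order]

# Usage on simulated expression values (synthetic data, not from the study).
rng = np.random.default_rng(0)
expression = np.concatenate([
    rng.normal(-4.0, 0.5, 100),
    rng.normal(0.0, 0.5, 100),
    rng.normal(4.0, 0.5, 100),
])
groups, means, sds, weights = fit_three_mode_em(expression)
```

Under these assumptions, the resulting low, middle and high groups could then be compared against an end-point such as survival or tumor grade. A Gaussian mixture is only one possible choice of component distribution; the source does not state which one the authors used.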
Keywords