Informatics in Medicine Unlocked (Jan 2020)

Identifying critical genes in esophageal squamous cell carcinoma using an ensemble approach

  • Pallabi Patowary,
  • Dhruba K. Bhattacharyya,
  • Pankaj Barah

Journal volume & issue
Vol. 18

Abstract

Esophageal squamous cell carcinoma (ESCC) is considered a deadly disease, especially in North-East India. A series of differentially expressed genes (DEGs) are suspected to be involved in the progression of ESCC. Many tools are available for identifying DEGs. To remove the bias in the DEGs reported by individual tools, a consensus function is needed so that users can rely on the output of differential expression analysis methods applied to multiple sources of data. In this study, we consider two microarray datasets (of 34 and 106 samples, respectively) and one RNA-seq dataset (of 29 samples) to conduct an unbiased integrative analysis towards the identification of critical genes for ESCC. First, each type of data is analyzed independently using six DEG identification tools; an integrative analysis supported by an effective consensus function is then conducted to identify an unbiased set of DEGs. The identified gene set includes the common genes obtained from the tools (P-value cut-off < 0.01) as well as some uncommon top-ranked genes (P-value cut-off < 0.001). Next, module preservation analysis is performed to identify a set of low-preserved modules. Finally, hub genes are identified from the selected low-preserved modules and validated both topologically and biologically. The identified hub genes include SOX11, COL27A1, TOP3A, BAG6, CDC6, EZH2, COL7A1, G6PD, and AKR1C2, which have been established to be critical for ESCC.

Keywords: ESCC, Microarray, RNA-seq, Differentially expressed gene, Module preservation, Differential expression analysis, Co-expression network
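
The consensus step described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' consensus function, whose exact definition is not given here. It assumes that "common" means significant in every tool at P < 0.01 and that "uncommon top-ranked" means significant in at least one tool at P < 0.001 without being common. The function name `consensus_degs`, the tool names, and the gene labels are hypothetical placeholders.

```python
from typing import Dict, Set, Tuple


def consensus_degs(
    pvalues_by_tool: Dict[str, Dict[str, float]],
    common_cutoff: float = 0.01,
    strict_cutoff: float = 0.001,
) -> Tuple[Set[str], Set[str], Set[str]]:
    """Combine per-tool DEG calls into one gene set (illustrative assumption only).

    pvalues_by_tool maps a tool name to a {gene: p_value} dictionary.
    Returns (common, uncommon, consensus).
    """
    # Genes each tool calls significant at the lenient cut-off.
    lenient_sets = [
        {gene for gene, p in pvalues.items() if p < common_cutoff}
        for pvalues in pvalues_by_tool.values()
    ]
    # "Common" genes: significant in every tool (assumed interpretation).
    common = set.intersection(*lenient_sets) if lenient_sets else set()

    # Genes that pass the stricter cut-off in at least one tool.
    strict: Set[str] = set()
    for pvalues in pvalues_by_tool.values():
        strict |= {gene for gene, p in pvalues.items() if p < strict_cutoff}

    # "Uncommon top-ranked" genes: strict hits that are not already common.
    uncommon = strict - common
    return common, uncommon, common | uncommon


# Toy input with placeholder tools and genes (not data from the study).
toy = {
    "tool_a": {"geneA": 0.005, "geneB": 0.0002, "geneC": 0.2},
    "tool_b": {"geneA": 0.008, "geneB": 0.05, "geneC": 0.3},
}
common, uncommon, consensus = consensus_degs(toy)
```

In this toy input, `geneA` passes the lenient cut-off in both tools, while `geneB` passes the strict cut-off in only one, so both end up in the consensus set and `geneC` is excluded.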