Informatics in Medicine Unlocked (Jan 2020)
Comorbidity network analysis and genetics of colorectal cancer
Abstract
Background: Colorectal cancer (CRC) is the third most common cancer in the United States and the second leading cause of cancer death. The goal of this study was to identify comorbidities and genes associated with CRC. Methods: A novel social network model was developed on the Healthcare Cost and Utilization Project (HCUP) State Inpatient Databases (SID) California database to study comorbidities of CRC. Ranked lists of comorbidities and comorbidity networks were created, and the prevalence of comorbidities in different stages of CRC was calculated. The ranked lists of comorbidities were then used for text mining of PubMed and DisGeNET to extract genes associated with CRC. Results: In early stages of CRC, 5,786 comorbidities were identified in females and 5,607 in males; in advanced stages, 5,609 comorbidities were identified in females and 5,427 in males. Associations between 1,937 different genes and CRC were extracted from PubMed, and 150 genes were associated with CRC in DisGeNET. The most frequently mentioned genes associated with CRC were TP53 (241 abstracts in PubMed), APC (115), and KRAS (106). These three genes, as well as MLH1 (98) and TGFBR2 (18), had DisGeNET scores of 0.5. The PPARG gene (43) had a DisGeNET score of 0.6. Conclusions: The results of the comorbidity network analyses suggest which comorbidities are most likely to accompany CRC. The discovered genes could be used to recruit more individuals who would benefit from genetic consultations. The identified associations between comorbidities, CRC, and shared genes can have important implications for the early detection and prognosis of CRC. Prevention and treatment of the discovered comorbidities could potentially lead to improved quality of life and better CRC outcomes.
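The abstract does not specify how the ranked lists, prevalence values, or comorbidity networks were computed. As an illustration only, the following minimal Python sketch shows one common way to derive such quantities from inpatient records: prevalence as the fraction of records carrying a diagnosis, ranking by that prevalence, and network edges weighted by co-occurrence counts. The function names, the record layout, and the toy diagnosis labels are all assumptions made for this example; they are not the authors' model and not HCUP data.

```python
from collections import Counter
from itertools import combinations

# Hypothetical toy input: each set holds the diagnosis labels found on one
# inpatient record of a patient with CRC. These are made-up values, not HCUP data.
records = [
    {"hypertension", "anemia", "diabetes"},
    {"hypertension", "anemia"},
    {"hypertension", "diabetes"},
    {"anemia"},
]


def comorbidity_prevalence(records):
    """Return {diagnosis: fraction of records that carry it}."""
    counts = Counter(dx for rec in records for dx in rec)
    n = len(records)
    return {dx: c / n for dx, c in counts.items()}


def ranked_comorbidities(records):
    """Rank diagnoses by prevalence, highest first; ties broken alphabetically."""
    prev = comorbidity_prevalence(records)
    return sorted(prev.items(), key=lambda kv: (-kv[1], kv[0]))


def cooccurrence_edges(records):
    """Weighted edges: number of records in which two diagnoses co-occur."""
    edges = Counter()
    for rec in records:
        # Sort so each unordered pair always maps to the same (a, b) key.
        for a, b in combinations(sorted(rec), 2):
            edges[(a, b)] += 1
    return edges


if __name__ == "__main__":
    for dx, p in ranked_comorbidities(records):
        print(f"{dx}: {p:.2f}")
    for (a, b), w in sorted(cooccurrence_edges(records).items()):
        print(f"{a} -- {b}: {w}")
```

In a stage-stratified analysis such as the one described, the same functions could be applied separately to the records of each stage and sex, and the resulting ranked lists compared.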