BMC Bioinformatics (May 2020)

Detecting PCOS susceptibility loci from genome-wide association studies via iterative trend correlation based feature screening

  • Xiaotian Dai,
  • Guifang Fu,
  • Randall Reese

DOI
https://doi.org/10.1186/s12859-020-3492-z
Journal volume & issue
Vol. 21, no. 1
pp. 1 – 15

Abstract

Background: Feature screening plays a critical role in ultrahigh-dimensional data analysis, where the number of features exponentially exceeds the number of observations. Biomedical research increasingly involves a case-control (binary) response together with extremely large-scale categorical features. However, approaches designed for such data types are limited in the extant literature. In this article, we propose a new feature screening approach based on iterative trend correlation (ITC-SIS, for short) to detect important susceptibility loci associated with polycystic ovary syndrome (PCOS) affection status by screening 731,442 SNP features collected from genome-wide association studies.

Results: We prove that the trend correlation based screening approach satisfies the strong screening consistency property under a set of reasonable conditions, which provides appealing theoretical support for its performance. We demonstrate through various simulation designs that the finite-sample performance of ITC-SIS is accurate and fast.

Conclusion: ITC-SIS serves as a good alternative method for detecting disease susceptibility loci in clinical genomic data.

Keywords