A parametric bootstrap approach for computing confidence intervals for genetic correlations with application to genetically determined protein-protein networks
Yi-Ting Tsai,
Yana Hrytsenko,
Michael Elgart,
Usman A. Tahir,
Zsu-Zsu Chen,
James G. Wilson,
Robert E. Gerszten,
Tamar Sofer
Affiliations
Yi-Ting Tsai
Department of Medicine, Brigham and Women’s Hospital, Boston, MA, USA; Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, MA, USA
Yana Hrytsenko
Department of Medicine, Brigham and Women’s Hospital, Boston, MA, USA; Department of Medicine, Harvard Medical School, Boston, MA, USA; CardioVascular Institute (CVI), Beth Israel Deaconess Medical Center, Boston, MA, USA
Michael Elgart
Department of Medicine, Brigham and Women’s Hospital, Boston, MA, USA; Department of Medicine, Harvard Medical School, Boston, MA, USA
Usman A. Tahir
Department of Medicine, Harvard Medical School, Boston, MA, USA; CardioVascular Institute (CVI), Beth Israel Deaconess Medical Center, Boston, MA, USA
Zsu-Zsu Chen
Department of Medicine, Harvard Medical School, Boston, MA, USA; Department of Internal Medicine, Division of Endocrinology, Diabetes, and Metabolism, Beth Israel Deaconess Medical Center, Boston, MA, USA
James G. Wilson
Department of Medicine, Harvard Medical School, Boston, MA, USA; CardioVascular Institute (CVI), Beth Israel Deaconess Medical Center, Boston, MA, USA
Robert E. Gerszten
Department of Medicine, Harvard Medical School, Boston, MA, USA; CardioVascular Institute (CVI), Beth Israel Deaconess Medical Center, Boston, MA, USA
Tamar Sofer
Department of Medicine, Brigham and Women’s Hospital, Boston, MA, USA; Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, MA, USA; Department of Medicine, Harvard Medical School, Boston, MA, USA; CardioVascular Institute (CVI), Beth Israel Deaconess Medical Center, Boston, MA, USA; Corresponding author
Summary: Genetic correlation refers to the correlation between the genetic determinants of a pair of traits. When using individual-level data, it is typically estimated from a bivariate model in which the correlation between the two traits is identifiable through a covariance model that incorporates the genetic relationships between individuals, e.g., via a pre-specified kinship matrix. Inference that relies on asymptotic normality of the genetic correlation estimates may be inaccurate when the sample size is small, when the genetic correlation is close to the boundary of the parameter space, or when the heritability of at least one of the traits is low. We address this problem by developing a parametric bootstrap procedure to construct confidence intervals for genetic correlation estimates. The procedure simulates paired traits under a range of heritability and genetic correlation parameters, using the population structure encapsulated by the kinship matrix. Heritabilities and genetic correlations are estimated using the closed-form, method-of-moments Haseman-Elston regression estimators. The proposed parametric bootstrap procedure is especially useful when genetic correlations are computed for pairs among thousands of traits measured on the exact same set of individuals. We demonstrate the parametric bootstrap approach on a proteomics dataset from the Jackson Heart Study.
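To make the idea in the summary concrete, the following is a minimal Python sketch of a parametric bootstrap for the genetic correlation, and not the authors' implementation. It rests on several assumptions that are not stated in the summary: traits are standardized, the Haseman-Elston estimator is used in its no-intercept form over pairs of distinct individuals, environmental effects are uncorrelated between the two traits, and the interval is a plain plug-in percentile interval simulated at the (clipped) point estimates rather than over a range of parameter values. The function names `he_estimates`, `simulate_traits`, and `bootstrap_ci` are hypothetical.

```python
import numpy as np

def he_estimates(Y, K):
    """Haseman-Elston (method-of-moments) estimates for two traits.

    Y: (n, 2) trait matrix; K: (n, n) kinship matrix with unit diagonal.
    Assumed form: no-intercept regression of trait cross-products on K_ij, i < j.
    Returns (h1, h2, rho_g); rho_g is NaN if either heritability estimate is <= 0.
    """
    Z = (Y - Y.mean(axis=0)) / Y.std(axis=0)      # standardize each trait
    iu = np.triu_indices(K.shape[0], k=1)         # pairs of distinct individuals
    k = K[iu]
    denom = k @ k
    z1, z2 = Z[:, 0], Z[:, 1]
    h1 = k @ np.outer(z1, z1)[iu] / denom
    h2 = k @ np.outer(z2, z2)[iu] / denom
    cross = 0.5 * (np.outer(z1, z2) + np.outer(z2, z1))[iu]
    cov_g = k @ cross / denom                     # genetic covariance estimate
    rho = cov_g / np.sqrt(h1 * h2) if h1 > 0 and h2 > 0 else np.nan
    return float(h1), float(h2), float(rho)

def simulate_traits(L, h1, h2, rho_g, rng):
    """Simulate two unit-variance traits given a Cholesky factor L of K.

    Assumption: environmental effects are independent across traits.
    """
    n = L.shape[0]
    c = rho_g * np.sqrt(h1 * h2)
    G = np.array([[h1, c], [c, h2]])              # genetic covariance of the two traits
    A = np.linalg.cholesky(G + 1e-10 * np.eye(2)) # jitter keeps rho_g = +/-1 factorizable
    genetic = L @ rng.standard_normal((n, 2)) @ A.T
    noise = rng.standard_normal((n, 2)) * np.sqrt([1.0 - h1, 1.0 - h2])
    return genetic + noise

def bootstrap_ci(Y, K, n_boot=200, alpha=0.05, seed=0):
    """Plug-in percentile bootstrap interval for rho_g (a simplification)."""
    rng = np.random.default_rng(seed)
    h1, h2, rho = he_estimates(Y, K)
    if not np.isfinite(rho):
        raise ValueError("rho_g is undefined: a heritability estimate is not positive")
    # Assumption: clip estimates into the valid parameter space before simulating.
    h1, h2 = np.clip([h1, h2], 0.01, 0.99)
    rho_c = float(np.clip(rho, -1.0, 1.0))
    L = np.linalg.cholesky(K + 1e-8 * np.eye(K.shape[0]))
    reps = np.array([
        he_estimates(simulate_traits(L, h1, h2, rho_c, rng), K)[2]
        for _ in range(n_boot)
    ])
    # Replicates with an undefined rho_g (NaN) are ignored by nanpercentile.
    lo, hi = np.nanpercentile(reps, [100 * alpha / 2, 100 * (1 - alpha / 2)])
    return rho, float(lo), float(hi)
```

Because the simulation depends only on the kinship matrix and the parameter values, not on the observed trait values, simulated replicates could in principle be reused across many trait pairs measured on the same individuals; this sketch does not implement that reuse.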