HGG Advances (Jul 2023)
Multivariate extension of penalized regression on summary statistics to construct polygenic risk scores for correlated traits
Abstract
Summary: Genetic correlations between human traits and disorders such as schizophrenia (SZ) and bipolar disorder (BD) diagnoses are well established. Improved prediction of individual traits has been obtained by combining predictors of multiple genetically correlated traits derived from summary statistics produced by genome-wide association studies, compared with single trait predictors. We extend this idea to penalized regression on summary statistics in Multivariate Lassosum, expressing regression coefficients for the multiple traits on single nucleotide polymorphisms (SNPs) as correlated random effects, similarly to multi-trait summary statistic best linear unbiased predictors (MT-SBLUPs). We also allow the SNP contributions to genetic covariance and heritability to depend on genomic annotations. We conducted simulations with two dichotomous traits having polygenic architecture similar to SZ and BD, using genotypes from 29,330 subjects from the CARTaGENE cohort. Multivariate Lassosum produced polygenic risk scores (PRSs) more strongly correlated with the true genetic risk predictor and had better discrimination power between affected and non-affected subjects than previously published sparse multi-trait (PANPRS) and univariate (Lassosum, sparse LDpred2, and the standard clumping and thresholding) methods in most simulation settings. Application of Multivariate Lassosum to predict SZ, BD, and related psychiatric traits in the Eastern Quebec SZ and BD kindred study revealed associations with every trait stronger than those obtained with univariate sparse PRSs, particularly when heritability and genetic covariance depended on genomic annotations. Multivariate Lassosum thus appears promising to improve prediction of genetically correlated traits with summary statistics for a selected subset of SNPs.