Communications Biology (Apr 2024)

Unsupervised deep representation learning enables phenotype discovery for genetic association studies of brain imaging

  • Khush Patel,
  • Ziqian Xie,
  • Hao Yuan,
  • Sheikh Muhammad Saiful Islam,
  • Yaochen Xie,
  • Wei He,
  • Wanheng Zhang,
  • Assaf Gottlieb,
  • Han Chen,
  • Luca Giancardo,
  • Alexander Knaack,
  • Evan Fletcher,
  • Myriam Fornage,
  • Shuiwang Ji,
  • Degui Zhi

DOI
https://doi.org/10.1038/s42003-024-06096-7
Journal volume & issue
Vol. 7, no. 1
pp. 1 – 14

Abstract

Understanding the genetic architecture of brain structure is challenging, partly due to difficulties in designing robust, non-biased descriptors of brain morphology. Until recently, brain measures for genome-wide association studies (GWAS) consisted of expert-defined or software-derived image-derived phenotypes (IDPs) that are often based on theoretical preconceptions or computed from limited amounts of data. Here, we present an approach to derive brain imaging phenotypes using unsupervised deep representation learning. We train a 3-D convolutional autoencoder model with reconstruction loss on 6130 UK Biobank (UKBB) participants’ T1 or T2-FLAIR (T2) brain MRIs to create a 128-dimensional representation known as Unsupervised Deep learning derived Imaging Phenotypes (UDIPs). GWAS of these UDIPs in held-out UKBB subjects (n = 22,880 discovery and n = 12,359/11,265 replication cohorts for T1/T2) identified 9457 significant SNPs organized into 97 independent genetic loci, of which 60 were replicated. Twenty-six loci were not reported in earlier T1 and T2 IDP-based UK Biobank GWAS. We developed a perturbation-based decoder interpretation approach to show that these loci are associated with UDIPs mapped to multiple relevant brain regions. Our results establish that unsupervised deep learning can derive robust, unbiased, heritable, and interpretable brain imaging phenotypes.
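To illustrate the core idea, the sketch below shows a minimal 3-D convolutional autoencoder with a 128-dimensional bottleneck trained under a reconstruction (MSE) loss, as the abstract describes. This is a hypothetical PyTorch illustration, not the authors' released code: the layer sizes, the 64³ input resolution, and the class name `ConvAutoencoder3D` are assumptions chosen for a self-contained example (real UKBB MRI volumes are much larger and require heavier architectures).

```python
import torch
import torch.nn as nn

class ConvAutoencoder3D(nn.Module):
    """Hypothetical sketch: 3-D conv autoencoder with a 128-d latent code
    (a UDIP-style representation), trained with reconstruction loss."""

    def __init__(self, latent_dim=128):
        super().__init__()
        # Encoder: three strided 3-D convolutions, 64^3 -> 8^3 spatially
        self.encoder = nn.Sequential(
            nn.Conv3d(1, 16, 4, stride=2, padding=1),   # 64 -> 32
            nn.ReLU(),
            nn.Conv3d(16, 32, 4, stride=2, padding=1),  # 32 -> 16
            nn.ReLU(),
            nn.Conv3d(32, 64, 4, stride=2, padding=1),  # 16 -> 8
            nn.ReLU(),
            nn.Flatten(),
            nn.Linear(64 * 8 * 8 * 8, latent_dim),      # 128-d bottleneck
        )
        # Decoder: mirror of the encoder, reconstructing the input volume
        self.decoder = nn.Sequential(
            nn.Linear(latent_dim, 64 * 8 * 8 * 8),
            nn.Unflatten(1, (64, 8, 8, 8)),
            nn.ConvTranspose3d(64, 32, 4, stride=2, padding=1),  # 8 -> 16
            nn.ReLU(),
            nn.ConvTranspose3d(32, 16, 4, stride=2, padding=1),  # 16 -> 32
            nn.ReLU(),
            nn.ConvTranspose3d(16, 1, 4, stride=2, padding=1),   # 32 -> 64
        )

    def forward(self, x):
        z = self.encoder(x)           # the 128-d imaging phenotype vector
        return self.decoder(z), z

model = ConvAutoencoder3D()
mri = torch.randn(2, 1, 64, 64, 64)          # batch of 2 synthetic volumes
recon, z = model(mri)
loss = nn.functional.mse_loss(recon, mri)     # reconstruction loss
print(z.shape)                                # torch.Size([2, 128])
```

After training such a model on held-out subjects' MRIs, each participant's `z` vector would serve as 128 quantitative phenotypes for GWAS; the paper's perturbation-based interpretation perturbs latent dimensions and observes which brain regions change in the decoded volume.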