Nature Communications (Sep 2023)

Demonstrating paths for unlocking the value of cloud genomics through cross cohort analysis

  • Nicole Deflaux,
  • Margaret Sunitha Selvaraj,
  • Henry Robert Condon,
  • Kelsey Mayo,
  • Sara Haidermota,
  • Melissa A. Basford,
  • Chris Lunt,
  • Anthony A. Philippakis,
  • Dan M. Roden,
  • Joshua C. Denny,
  • Anjene Musick,
  • Rory Collins,
  • Naomi Allen,
  • Mark Effingham,
  • David Glazer,
  • Pradeep Natarajan,
  • Alexander G. Bick

DOI
https://doi.org/10.1038/s41467-023-41185-x
Journal volume & issue
Vol. 14, no. 1
pp. 1–10

Abstract

Recently, large-scale genomic projects such as All of Us and the UK Biobank have introduced a new research paradigm in which data are stored centrally in cloud-based Trusted Research Environments (TREs). To characterize the advantages and drawbacks of different TRE attributes in facilitating cross-cohort analysis, we conduct a Genome-Wide Association Study of standard lipid measures using two approaches: meta-analysis and pooled analysis. Comparison of full summary data from both approaches with an external study shows strong correlation of known loci with lipid levels (R² ~ 83–97%). Importantly, 90 variants meet the significance threshold only in the meta-analysis and 64 variants are significant only in the pooled analysis, with approximately 20% of variants in each of those groups being most prevalent in non-European, non-Asian ancestry individuals. These findings have important implications, as technical and policy choices lead to cross-cohort analyses generating similar, but not identical, results, particularly for non-European ancestral populations.
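
To illustrate the distinction between the two approaches named above: in a meta-analysis, each cohort is analyzed separately and only per-variant summary statistics are combined, whereas a pooled analysis fits one model to the merged individual-level data. The sketch below shows one common way to combine summary statistics, a fixed-effect inverse-variance weighted meta-analysis. It is an assumption made for illustration, not necessarily the method used in the study; the function name `inverse_variance_meta` and the input values are hypothetical.

```python
import math


def inverse_variance_meta(estimates):
    """Combine per-cohort effect estimates for a single variant.

    `estimates` is a list of (beta, standard_error) pairs, one per cohort.
    Illustrative fixed-effect inverse-variance weighting; hypothetical
    helper, not taken from the study.
    """
    # Each cohort is weighted by the inverse of its sampling variance.
    weights = [1.0 / se ** 2 for _, se in estimates]
    total_weight = sum(weights)

    # Combined effect is the weighted mean of the cohort effects.
    beta = sum(w * b for w, (b, _) in zip(weights, estimates)) / total_weight
    se = math.sqrt(1.0 / total_weight)

    # Two-sided p-value under a standard normal approximation.
    z = beta / se
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return beta, se, z, p_value


# Made-up summary statistics for one variant in two cohorts.
beta, se, z, p_value = inverse_variance_meta([(0.10, 0.02), (0.14, 0.04)])
```

Because the weights depend on each cohort's standard error, a variant observed mostly in one cohort contributes differently here than it would in a single regression over pooled individuals, which is one reason the two approaches can disagree near a significance threshold.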