NeuroImage (Nov 2022)

A comparison of methods to harmonize cortical thickness measurements across scanners and sites

  • Delin Sun,
  • Gopalkumar Rakesh,
  • Courtney C. Haswell,
  • Mark Logue,
  • C. Lexi Baird,
  • Erin N. O'Leary,
  • Andrew S. Cotton,
  • Hong Xie,
  • Marijo Tamburrino,
  • Tian Chen,
  • Emily L. Dennis,
  • Neda Jahanshad,
  • Lauren E. Salminen,
  • Sophia I. Thomopoulos,
  • Faisal Rashid,
  • Christopher R.K. Ching,
  • Saskia B.J. Koch,
  • Jessie L. Frijling,
  • Laura Nawijn,
  • Mirjam van Zuiden,
  • Xi Zhu,
  • Benjamin Suarez-Jimenez,
  • Anika Sierk,
  • Henrik Walter,
  • Antje Manthey,
  • Jennifer S. Stevens,
  • Negar Fani,
  • Sanne J.H. van Rooij,
  • Murray Stein,
  • Jessica Bomyea,
  • Inga K. Koerte,
  • Kyle Choi,
  • Steven J.A. van der Werff,
  • Robert R.J.M. Vermeiren,
  • Julia Herzog,
  • Lauren A.M. Lebois,
  • Justin T. Baker,
  • Elizabeth A. Olson,
  • Thomas Straube,
  • Mayuresh S. Korgaonkar,
  • Elpiniki Andrew,
  • Ye Zhu,
  • Gen Li,
  • Jonathan Ipser,
  • Anna R. Hudson,
  • Matthew Peverill,
  • Kelly Sambrook,
  • Evan Gordon,
  • Lee Baugh,
  • Gina Forster,
  • Raluca M. Simons,
  • Jeffrey S. Simons,
  • Vincent Magnotta,
  • Adi Maron-Katz,
  • Stefan du Plessis,
  • Seth G. Disner,
  • Nicholas Davenport,
  • Daniel W. Grupe,
  • Jack B. Nitschke,
  • Terri A. deRoon-Cassini,
  • Jacklynn M. Fitzgerald,
  • John H. Krystal,
  • Ifat Levy,
  • Miranda Olff,
  • Dick J. Veltman,
  • Li Wang,
  • Yuval Neria,
  • Michael D. De Bellis,
  • Tanja Jovanovic,
  • Judith K. Daniels,
  • Martha Shenton,
  • Nic J.A. van de Wee,
  • Christian Schmahl,
  • Milissa L. Kaufman,
  • Isabelle M. Rosso,
  • Scott R. Sponheim,
  • David Bernd Hofmann,
  • Richard A. Bryant,
  • Kelene A. Fercho,
  • Dan J. Stein,
  • Sven C. Mueller,
  • Bobak Hosseini,
  • K. Luan Phan,
  • Katie A. McLaughlin,
  • Richard J. Davidson,
  • Christine L. Larson,
  • Geoffrey May,
  • Steven M. Nelson,
  • Chadi G. Abdallah,
  • Hassaan Gomaa,
  • Amit Etkin,
  • Soraya Seedat,
  • Ilan Harpaz-Rotem,
  • Israel Liberzon,
  • Theo G.M. van Erp,
  • Yann Quidé,
  • Xin Wang,
  • Paul M. Thompson,
  • Rajendra A. Morey

Journal volume & issue
Vol. 261
p. 119509

Abstract

Results of neuroimaging datasets aggregated from multiple sites may be biased by site-specific profiles in participants’ demographic and clinical characteristics, as well as MRI acquisition protocols and scanning platforms. We compared the impact of four different harmonization methods on results obtained from analyses of cortical thickness data: (1) linear mixed-effects model (LME) that models site-specific random intercepts (LME_INT), (2) LME that models both site-specific random intercepts and age-related random slopes (LME_INT+SLP), (3) ComBat, and (4) ComBat with a generalized additive model (ComBat-GAM). Our test case for comparing harmonization methods was cortical thickness data aggregated from 29 sites, which included 1,340 cases with posttraumatic stress disorder (PTSD) (6.2–81.8 years old) and 2,057 trauma-exposed controls without PTSD (6.3–85.2 years old). We found that, compared to the other data harmonization methods, data processed with ComBat-GAM was more sensitive to the detection of significant case-control differences (χ²(3) = 63.704, p < 0.001) as well as case-control differences in age-related cortical thinning (χ²(3) = 12.082, p = 0.007). Both ComBat and ComBat-GAM outperformed LME methods in detecting sex differences (χ²(3) = 9.114, p = 0.028) in regional cortical thickness. ComBat-GAM also led to stronger estimates of age-related declines in cortical thickness (corrected p-values < 0.001), stronger estimates of case-related cortical thickness reduction (corrected p-values < 0.001), weaker estimates of age-related declines in cortical thickness in cases than controls (corrected p-values < 0.001), stronger estimates of cortical thickness reduction in females than males (corrected p-values < 0.001), and stronger estimates of cortical thickness reduction in females relative to males in cases than controls (corrected p-values < 0.001).
Our results support the use of ComBat-GAM to minimize confounds and increase statistical power when harmonizing data with non-linear effects, and the use of either ComBat or ComBat-GAM for harmonizing data with linear effects.
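
To give a rough sense of what site harmonization does, the sketch below aligns each site's mean and spread for a single region to the pooled values. It is a simplified location-scale adjustment under stated assumptions, not the ComBat or ComBat-GAM procedure evaluated in the study: it applies no empirical Bayes shrinkage and does not preserve covariate effects such as age, sex, or diagnosis. The function name `harmonize_location_scale` and the thickness values are hypothetical and chosen only for illustration.

```python
from statistics import mean, pstdev


def harmonize_location_scale(values, sites):
    """Align each site's mean and spread to the pooled mean and spread.

    Simplified illustration only: covariates are ignored and no empirical
    Bayes shrinkage is applied, unlike ComBat or ComBat-GAM.
    """
    grand_mean = mean(values)

    # Group measurements by site label.
    by_site = {}
    for value, label in zip(values, sites):
        by_site.setdefault(label, []).append(value)

    site_mean = {label: mean(vs) for label, vs in by_site.items()}
    site_sd = {label: pstdev(vs) for label, vs in by_site.items()}

    # Pooled within-site standard deviation, weighted by site sample size.
    pooled_var = sum(
        len(vs) * site_sd[label] ** 2 for label, vs in by_site.items()
    ) / len(values)
    pooled_sd = pooled_var ** 0.5

    harmonized = []
    for value, label in zip(values, sites):
        sd = site_sd[label] or 1.0  # guard against a zero-variance site
        z = (value - site_mean[label]) / sd
        harmonized.append(grand_mean + z * pooled_sd)
    return harmonized


# Hypothetical cortical thickness values (mm) for one region at two sites.
thickness = [2.4, 2.6, 2.5, 3.0, 3.4, 3.2]
site = ["A", "A", "A", "B", "B", "B"]
harmonized = harmonize_location_scale(thickness, site)
print([round(h, 3) for h in harmonized])
```

After this adjustment both sites share the same mean and standard deviation, which also illustrates the risk the compared methods are designed to address: if sites differ in age or case-control composition, removing raw site differences without modeling those covariates would remove biological signal as well.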

Keywords