NeuroImage (Sep 2021)

Selecting software pipelines for change in flortaucipir SUVR: Balancing repeatability and group separation

  • Christopher G. Schwarz,
  • Terry M. Therneau,
  • Stephen D. Weigand,
  • Jeffrey L. Gunter,
  • Val J. Lowe,
  • Scott A. Przybelski,
  • Matthew L. Senjem,
  • Hugo Botha,
  • Prashanthi Vemuri,
  • Kejal Kantarci,
  • Bradley F. Boeve,
  • Jennifer L. Whitwell,
  • Keith A. Josephs,
  • Ronald C. Petersen,
  • David S. Knopman,
  • Clifford R. Jack, Jr

Journal volume & issue
Vol. 238
p. 118259

Abstract

Since tau PET tracers were introduced, investigators have quantified them using a wide variety of automated methods. As longitudinal cohort studies acquire second and third time points of serial within-person tau PET data, determining the best pipeline to measure change has become crucial. We compared a total of 415 different quantification methods (each a combination of multiple options) according to their effects on a) differences in annual SUVR change between clinical groups, and b) longitudinal measurement repeatability as measured by the error term from a linear mixed-effects model. Our comparisons used MRI and flortaucipir scans of 97 Mayo Clinic study participants who clinically either: a) were cognitively unimpaired, or b) had cognitive impairments that were consistent with Alzheimer's disease pathology. Tested methods included cross-sectional and longitudinal variants of two overarching pipelines (FreeSurfer 6.0, and an in-house pipeline based on SPM12), three choices of target region (entorhinal, inferior temporal, and a temporal lobe meta-ROI), five types of partial volume correction (PVC) (none, two-compartment, three-compartment, geometric transfer matrix (GTM), and a tau-specific GTM variant), seven choices of reference region (cerebellar crus, cerebellar gray matter, whole cerebellum, pons, supratentorial white matter (WM), eroded supratentorial WM, and a composite of eroded supratentorial WM, pons, and whole cerebellum), two choices of region masking (gray matter (GM) only, or GM and WM), and two choices of statistic (voxel-wise mean vs. median). Our strongest findings were: 1) larger temporal-lobe target regions greatly outperformed entorhinal cortex (median sample size estimates based on a hypothetical clinical trial were 520–526 vs. 1740); 2) longitudinal processing pipelines outperformed cross-sectional pipelines (median sample size estimates were 483 vs. 572); and 3) reference regions including supratentorial WM outperformed traditional cerebellar and pontine options (median sample size estimates were 370 vs. 559). Altogether, our results favored longitudinal SUVR methods and a temporal-lobe meta-ROI that includes adjacent (juxtacortical) WM, a composite reference region (eroded supratentorial WM + pons + whole cerebellum), 2-class voxel-based PVC, and median statistics.
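For readers unfamiliar with the quantity being compared, the sketch below illustrates how an SUVR can be formed as the ratio of a target-region statistic to a reference-region statistic, with the mean vs. median choice exposed as a parameter. It is a minimal illustration only, not the authors' pipeline: the function names, the toy uptake values, and the two-time-point annualized difference are all assumptions made here for clarity (the study itself estimated change and repeatability with a linear mixed-effects model).

```python
from statistics import mean, median


def suvr(target_voxels, reference_voxels, statistic="median"):
    """Ratio of the target-region statistic to the reference-region statistic.

    `statistic` selects the voxel-wise summary ("mean" or "median").
    Hypothetical helper for illustration only.
    """
    stat = {"mean": mean, "median": median}[statistic]
    return stat(target_voxels) / stat(reference_voxels)


def annual_change(suvr_baseline, suvr_followup, years_between):
    """Simple annualized SUVR change between two scans (an assumed
    simplification; the paper fits a linear mixed-effects model)."""
    return (suvr_followup - suvr_baseline) / years_between


# Hypothetical toy uptake values; the last target voxel is an outlier.
target_t0 = [1.2, 1.3, 1.4, 1.5, 3.0]
target_t1 = [1.3, 1.4, 1.5, 1.6, 3.1]
reference = [1.0, 1.0, 1.0, 1.0, 1.0]

baseline = suvr(target_t0, reference, "median")
followup = suvr(target_t1, reference, "median")
rate = annual_change(baseline, followup, years_between=2.0)
```

In this toy case the median-based baseline SUVR is less affected by the single outlying voxel than the mean-based one, which is one plausible intuition for why the choice of statistic was among the options tested.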

Keywords