NeuroImage: Clinical (Jan 2017)
MRI FLAIR lesion segmentation in multiple sclerosis: Does automated segmentation hold up with manual annotation?
Abstract
Introduction: Magnetic resonance imaging (MRI) has become key in the diagnosis and disease monitoring of patients with multiple sclerosis (MS). Both T2 lesion load and gadolinium (Gd)-enhancing T1 lesions are important endpoints in MS clinical trials, serving as surrogates of clinical disease activity. Largely because of methodological constraints, T2 and fluid-attenuated inversion recovery (FLAIR) lesion quantification is still performed manually or in a semi-automated fashion, although considerable effort has gone into automated quantitative lesion segmentation. In 2012, Schmidt and co-workers published an algorithm for use on FLAIR sequences. The aim of this study was to apply the Schmidt algorithm to an independent data set and to compare automated segmentation with the inter-rater variability of three independent, experienced raters.
Methods: MRI data of 50 patients with relapsing-remitting MS (RRMS) were randomly selected from a larger pool of MS patients attending the MS Clinic at the Brain and Mind Centre, University of Sydney, Australia. MRIs were acquired on a 3.0 T GE scanner (Discovery MR750, GE Medical Systems, Milwaukee, WI) using an 8-channel head coil. We determined T2 lesion load (total lesion volume and total lesion number) using three versions of an automated segmentation algorithm first described by Schmidt et al. (2012): the lesion growth algorithm (LGA) based on SPM8, the LGA based on SPM12, and the lesion prediction algorithm (LPA) based on SPM12. Additionally, manual segmentation was performed by three independent raters. We calculated inter-rater correlation coefficients (ICC) and Dice coefficients (DC) for all possible pairwise comparisons.
Results: For lesion volume, we found a strong correlation between manual segmentation and automated segmentation based on LGA SPM8 (ICC = 0.958 and DC = 0.60) that was not statistically different from the inter-rater correlation (ICC = 0.97 and DC = 0.66). Correlation between the two other algorithms (LGA SPM12 and LPA SPM12) and the manual raters was weaker but still adequate (ICC = 0.927 and DC = 0.53 for LGA SPM12; ICC = 0.949 and DC = 0.57 for LPA SPM12). For lesion number, variability was significantly higher for both manual and automated segmentation.
Conclusion: Automated lesion volume quantification can be applied reliably to FLAIR data sets using the SPM-based algorithm of Schmidt et al. and shows good agreement with manual segmentation.
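For readers unfamiliar with the Dice coefficient reported above, the following is a minimal sketch of how pairwise spatial overlap between segmentations can be computed, using the standard definition DC = 2|A ∩ B| / (|A| + |B|). It is not the study's implementation: the function names, the flattened binary-mask representation, the toy data, and the convention that two empty masks score 1.0 are all assumptions made for illustration.

```python
from itertools import combinations


def dice_coefficient(mask_a, mask_b):
    """Dice overlap 2|A∩B| / (|A| + |B|) for two equal-length binary masks."""
    if len(mask_a) != len(mask_b):
        raise ValueError("masks must cover the same number of voxels")
    # Voxels labelled as lesion in both masks.
    overlap = sum(1 for a, b in zip(mask_a, mask_b) if a and b)
    # Total lesion voxels across both masks.
    total = sum(1 for a in mask_a if a) + sum(1 for b in mask_b if b)
    # Assumed convention: two empty masks are treated as perfect agreement.
    return 1.0 if total == 0 else 2.0 * overlap / total


def pairwise_dice(masks):
    """Dice coefficient for every unordered pair of named masks."""
    return {
        (first, second): dice_coefficient(masks[first], masks[second])
        for first, second in combinations(sorted(masks), 2)
    }


# Hypothetical toy masks (flattened voxel grids), not data from the study.
masks = {
    "rater_1": [1, 1, 1, 0, 0, 0],
    "rater_2": [1, 1, 0, 0, 0, 0],
    "algorithm": [0, 1, 1, 1, 0, 0],
}
scores = pairwise_dice(masks)
for pair, score in scores.items():
    print(pair, round(score, 2))
```

In practice the masks would be three-dimensional lesion maps flattened to one dimension after being brought into the same image space; the overlap measure itself is unchanged.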
Keywords