Quality assurance of late gadolinium enhancement cardiac magnetic resonance images: a deep learning classifier for confidence in the presence or absence of abnormality with potential to prompt real-time image optimization

Sameer Zaman; Kavitha Vimalesvaran; Digby Chappell; Marta Varela; Nicholas S. Peters; Hunain Shiwani; Kristopher D. Knott; Rhodri H. Davies; James C. Moon; Anil A. Bharath; Nick WF Linton; Darrel P. Francis; Graham D. Cole; James P. Howard

Journal of Cardiovascular Magnetic Resonance (Jan 2024)

Affiliations

Sameer Zaman: National Heart and Lung Institute, Imperial College London, London SW7 2AZ, UK; Imperial College Healthcare NHS Trust, London W12 0HS, UK; AI for Healthcare Centre for Doctoral Training, Imperial College London, London SW7 2AZ, UK
Kavitha Vimalesvaran: National Heart and Lung Institute, Imperial College London, London SW7 2AZ, UK; AI for Healthcare Centre for Doctoral Training, Imperial College London, London SW7 2AZ, UK
Digby Chappell: AI for Healthcare Centre for Doctoral Training, Imperial College London, London SW7 2AZ, UK
Marta Varela: National Heart and Lung Institute, Imperial College London, London SW7 2AZ, UK
Nicholas S. Peters: National Heart and Lung Institute, Imperial College London, London SW7 2AZ, UK; Imperial College Healthcare NHS Trust, London W12 0HS, UK
Hunain Shiwani: Institute of Cardiovascular Science, University College London, London WC1E 6DD, UK; Barts Heart Centre, St. Bartholomew’s Hospital, London EC1A 7BE, UK
Kristopher D. Knott: Institute of Cardiovascular Science, University College London, London WC1E 6DD, UK; St. George’s University Hospitals NHS Foundation Trust, London SW17 0QT, UK
Rhodri H. Davies: Institute of Cardiovascular Science, University College London, London WC1E 6DD, UK; Barts Heart Centre, St. Bartholomew’s Hospital, London EC1A 7BE, UK
James C. Moon: Institute of Cardiovascular Science, University College London, London WC1E 6DD, UK; Barts Heart Centre, St. Bartholomew’s Hospital, London EC1A 7BE, UK
Anil A. Bharath: Department of Bioengineering, Imperial College London, London SW7 2AZ, UK
Nick WF Linton: Imperial College Healthcare NHS Trust, London W12 0HS, UK; Department of Bioengineering, Imperial College London, London SW7 2AZ, UK; Corresponding author at: Department of Bioengineering, Imperial College London, London SW7 2AZ, UK.
Darrel P. Francis: National Heart and Lung Institute, Imperial College London, London SW7 2AZ, UK; Imperial College Healthcare NHS Trust, London W12 0HS, UK
Graham D. Cole: National Heart and Lung Institute, Imperial College London, London SW7 2AZ, UK; Imperial College Healthcare NHS Trust, London W12 0HS, UK
James P. Howard: National Heart and Lung Institute, Imperial College London, London SW7 2AZ, UK; Imperial College Healthcare NHS Trust, London W12 0HS, UK

Journal volume & issue: Vol. 26, no. 1, p. 101040

Abstract

Background: Late gadolinium enhancement (LGE) of the myocardium has significant diagnostic and prognostic implications, with even small areas of enhancement being important. Distinguishing between definitely normal and definitely abnormal LGE images is usually straightforward, but diagnostic uncertainty arises when reporters are not sure whether the observed LGE is genuine. This uncertainty might be resolved by repeating the acquisition (to remove artifact) or by acquiring further intersecting images, but this must take place before the scan finishes. Real-time quality assurance by humans is a complex task requiring training and experience, so the ability to identify which images have an intermediate likelihood of LGE while the scan is ongoing, without an expert present, is of high value. Such decision support could prompt immediate image optimization or acquisition of supplementary images to confirm or refute the presence of genuine LGE, which could reduce ambiguity in reports.

Methods: Short-axis, phase-sensitive inversion recovery late gadolinium images were extracted from our clinical cardiac magnetic resonance (CMR) database and shuffled. Two independent, blinded experts scored each individual slice for “LGE likelihood” on a visual analog scale, from 0 (absolute certainty of no LGE) to 100 (absolute certainty of LGE), with 50 representing clinical equipoise. The scored images were divided into two classes: “high certainty” (of whether LGE was present or not) and “low certainty.” The dataset was split into training, validation, and test sets (70:15:15). A deep learning binary classifier based on the EfficientNetV2 convolutional neural network architecture was trained to distinguish between these classes. Classifier performance on the test set was evaluated by calculating the accuracy, precision, recall, F1-score, and area under the receiver operating characteristic curve (ROC AUC). Performance was also evaluated on an external test set of images from a different center.

Results: One thousand six hundred and forty-five images (from 272 patients) were labeled and split at the patient level into training (1151 images), validation (247 images), and test (247 images) sets for the deep learning binary classifier. Of these, 1208 images were “high certainty” (255 for LGE, 953 for no LGE), and 437 were “low certainty.” An external test set comprising 247 images from 41 patients from another center was also employed. After 100 epochs, the performance on the internal test set was accuracy = 0.94, recall = 0.80, precision = 0.97, F1-score = 0.87, and ROC AUC = 0.94. The classifier also performed robustly on the external test set (accuracy = 0.91, recall = 0.73, precision = 0.93, F1-score = 0.82, and ROC AUC = 0.91). These results were benchmarked against a reference inter-expert accuracy of 0.86.

Conclusion: Deep learning shows potential to automate quality control of late gadolinium imaging in CMR. The ability to identify short-axis images with intermediate LGE likelihood in real time may serve as a useful decision-support tool. This approach has the potential to guide immediate further imaging while the patient is still in the scanner, thereby reducing the frequency of recalls and inconclusive reports due to diagnostic indecision.
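The labeling, patient-level splitting, and threshold-based metrics described in the Methods can be illustrated with a minimal Python sketch. This is not the authors' code. The function names, the 25-point margin around 50 used to define "low certainty," and the choice of "low certainty" as the positive class are all assumptions made for illustration; the abstract does not state them. ROC AUC is omitted because it requires continuous classifier scores rather than hard predictions.

```python
import random


def certainty_label(score, margin=25):
    """Map a 0-100 LGE-likelihood score to a binary class.

    Returns 1 ("low certainty") when the score lies within `margin` of 50
    (clinical equipoise), otherwise 0 ("high certainty").
    ASSUMPTION: the margin value is hypothetical, not taken from the paper.
    """
    return 1 if abs(score - 50) < margin else 0


def split_by_patient(patient_ids, fractions=(0.70, 0.15, 0.15), seed=0):
    """Assign whole patients to train/val/test so no patient spans two sets.

    The first two fractions set the train and validation sizes; the
    remaining patients go to the test set. Image counts per set are only
    approximately proportional, since patients contribute unequal numbers
    of slices.
    """
    unique = sorted(set(patient_ids))
    random.Random(seed).shuffle(unique)
    n_train = round(fractions[0] * len(unique))
    n_val = round(fractions[1] * len(unique))
    return {
        "train": set(unique[:n_train]),
        "val": set(unique[n_train:n_train + n_val]),
        "test": set(unique[n_train + n_val:]),
    }


def binary_metrics(y_true, y_pred):
    """Accuracy, precision, recall, and F1 for labels in {0, 1}."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {
        "accuracy": (tp + tn) / len(pairs),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }
```

Splitting on patient identifiers rather than on individual images keeps slices from the same patient out of both the training and test sets, which is the point of a patient-level split.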

Published in Journal of Cardiovascular Magnetic Resonance

ISSN: 1097-6647 (Print); 1532-429X (Online)
Publisher: Elsevier
Country of publisher: United States
LCC subjects: Medicine: Internal medicine: Specialties of internal medicine: Diseases of the circulatory (Cardiovascular) system
Website: https://www.sciencedirect.com/journal/journal-of-cardiovascular-magnetic-resonance
