PreRadE: Pretraining Tasks on Radiology Images and Reports Evaluation Framework

Matthew Coleman; Joanna F. Dipnall; Myong Chol Jung; Lan Du

doi:10.3390/math10244661

Mathematics (Dec 2022)

Affiliations

Matthew Coleman: Faculty of Information Technology, Monash University, Melbourne, VIC 3800, Australia
Joanna F. Dipnall: School of Public Health and Preventive Medicine, Monash University, Melbourne, VIC 3004, Australia
Myong Chol Jung: Faculty of Information Technology, Monash University, Melbourne, VIC 3800, Australia
Lan Du: Faculty of Information Technology, Monash University, Melbourne, VIC 3800, Australia

DOI: https://doi.org/10.3390/math10244661
Journal volume & issue: Vol. 10, no. 24, p. 4661

Abstract

Recently, self-supervised pretraining of transformers has gained considerable attention in analyzing electronic medical records. However, systematic evaluation of different pretraining tasks in radiology applications using both images and radiology reports is still lacking. We propose PreRadE, a simple proof-of-concept framework that enables novel evaluation of pretraining tasks in a controlled environment. We investigated the three most commonly used pretraining tasks, Masked Language Modelling (MLM), Masked Feature Regression (MFR), and Image to Text Matching (ITM), and their combinations against downstream radiology classification on MIMIC-CXR, a medical chest X-ray imaging and radiology text report dataset. Our experiments in the multimodal setting show that (1) pretraining with MLM yields the greatest benefit to classification performance, largely due to the task-relevant information learned from the radiology reports, and (2) pretraining with only a single task can introduce variation in classification performance across different fine-tuning episodes, suggesting that composite task objectives incorporating both image and text modalities are better suited to generating reliably performant models.
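
As an illustration of the three pretraining tasks named in the abstract, the following minimal Python sketch shows how training inputs for each could be constructed. It is not the PreRadE implementation, which this record does not describe. The function names, the 15% masking rate, the single `[MASK]` replacement, the zeroing of masked feature vectors, and the 50% mismatch rate are all assumptions chosen for illustration.

```python
import random

MASK = "[MASK]"  # hypothetical mask symbol for report tokens


def mask_tokens(tokens, mask_prob=0.15, rng=random):
    """MLM input: hide some report tokens; labels keep the originals."""
    masked, labels = [], []
    for tok in tokens:
        if rng.random() < mask_prob:
            masked.append(MASK)
            labels.append(tok)    # the model must predict this token
        else:
            masked.append(tok)
            labels.append(None)   # position ignored by the loss
    return masked, labels


def mask_features(features, mask_prob=0.15, rng=random):
    """MFR input: zero some image feature vectors; targets keep the originals."""
    masked, targets = [], []
    for vec in features:
        if rng.random() < mask_prob:
            masked.append([0.0] * len(vec))
            targets.append(list(vec))  # regression target for this region
        else:
            masked.append(list(vec))
            targets.append(None)
    return masked, targets


def make_itm_pairs(image_ids, reports, rng=random):
    """ITM input: pair each image with its own report (1) or another one (0)."""
    pairs = []
    for i, image_id in enumerate(image_ids):
        if len(reports) > 1 and rng.random() < 0.5:
            j = rng.choice([k for k in range(len(reports)) if k != i])
            pairs.append((image_id, reports[j], 0))
        else:
            pairs.append((image_id, reports[i], 1))
    return pairs
```

Under these assumptions, a composite objective of the kind the abstract refers to would draw on more than one of these input constructions for the same image and report pair, so that both modalities contribute to the pretraining signal.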

Published in Mathematics

ISSN: 2227-7390 (Online)
Publisher: MDPI AG
Country of publisher: Switzerland
LCC subjects: Science: Mathematics
Website: http://www.mdpi.com/journal/mathematics
