Applied Sciences (Aug 2023)

Convolutional Neural Network and Language Model-Based Sequential CT Image Captioning for Intracerebral Hemorrhage

  • Gi-Youn Kim,
  • Byoung-Doo Oh,
  • Chulho Kim,
  • Yu-Seop Kim

DOI
https://doi.org/10.3390/app13179665
Journal volume & issue
Vol. 13, no. 17
p. 9665

Abstract

Intracerebral hemorrhage is a severe condition in which more than one-third of patients die within a month. Neuroimaging examinations are essential for diagnosing intracranial hemorrhage, so interpreting them is a crucial step in clinical practice. Human image interpretation has inherent limitations, however, because a reader can handle only a restricted range of tasks. Medical image captioning has been studied as a way to address this, but prior work has focused primarily on single images. Actual medical images, such as CT scans, often consist of continuous sequences, which makes existing approaches difficult to apply directly. This paper therefore proposes a CT image captioning model that combines a 3D-CNN model with distilGPT-2. Four combinations of 3D-CNN models and language models were compared and analyzed for performance. The study also examined the effect of applying penalties to the loss function and of adjusting the penalty values during training. The proposed model achieved a maximum BLEU score of 0.35 on the in-house dataset, and applying loss function penalties made the generated text more similar to human interpretations in medical image reports.
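
For readers unfamiliar with the BLEU score reported above, the following is a minimal sketch of how such a score can be computed for one generated caption against one reference report. The abstract does not state which BLEU variant, n-gram order, or tokenization the authors used, so everything here is an assumption: the function names `ngrams` and `bleu` are hypothetical, a single reference is used, and no smoothing is applied. The example sentences are invented for illustration and are not taken from the paper's dataset.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Count all contiguous n-grams in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def bleu(candidate, reference, max_n=4):
    """Sentence-level BLEU against a single reference, without smoothing.

    Assumption: uniform weights over n-gram orders 1..max_n. The score is
    0.0 as soon as any n-gram order has no match.
    """
    log_precisions = []
    for n in range(1, max_n + 1):
        cand_counts = ngrams(candidate, n)
        ref_counts = ngrams(reference, n)
        total = sum(cand_counts.values())
        # Clip each candidate n-gram count by its count in the reference.
        clipped = sum(min(count, ref_counts[gram])
                      for gram, count in cand_counts.items())
        if total == 0 or clipped == 0:
            return 0.0
        log_precisions.append(math.log(clipped / total))

    # Brevity penalty: penalize candidates shorter than the reference.
    c, r = len(candidate), len(reference)
    brevity_penalty = 1.0 if c > r else math.exp(1 - r / c)

    # Geometric mean of the modified n-gram precisions.
    return brevity_penalty * math.exp(sum(log_precisions) / max_n)


# Hypothetical caption and reference, whitespace-tokenized.
generated = "acute hemorrhage in the left basal ganglia".split()
reference = "acute hemorrhage in the right basal ganglia".split()
score = bleu(generated, reference, max_n=2)
```

A score of 1.0 means the generated caption matches the reference exactly, and 0.0 means at least one n-gram order has no overlap, so the reported maximum of 0.35 indicates partial n-gram overlap with the human-written reports.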

Keywords