Sequential Brain CT Image Captioning Based on the Pre-Trained Classifiers and a Language Model

Jin-Woo Kong; Byoung-Doo Oh; Chulho Kim; Yu-Seop Kim

doi:10.3390/app14031193

Applied Sciences (Jan 2024)

Affiliations

Jin-Woo Kong: Department of Convergence Software, Hallym University, Chuncheon-si 24252, Gangwon-do, Republic of Korea
Byoung-Doo Oh: Cerebrovascular Disease Research Center, Hallym University, Chuncheon-si 24252, Gangwon-do, Republic of Korea
Chulho Kim: Department of Neurology, Chuncheon Sacred Heart Hospital, Chuncheon-si 24253, Gangwon-do, Republic of Korea
Yu-Seop Kim: Department of Convergence Software, Hallym University, Chuncheon-si 24252, Gangwon-do, Republic of Korea

Journal volume & issue: Vol. 14, no. 3, p. 1193

Abstract

Intracerebral hemorrhage (ICH) is a severe, life-threatening cerebrovascular disorder that requires swift diagnosis and treatment. CT scans are the most effective diagnostic tool for detecting cerebral hemorrhage, but interpreting them typically requires skilled professionals. In regions with a shortage of such experts, or in situations with time constraints, diagnosis may be delayed. In this paper, we propose a method that combines a pre-trained CNN classifier and GPT-2 to generate text for sequentially acquired ICH CT images. First, the CNN is fine-tuned to classify the presence of ICH in publicly available single CT images; it is then used to extract feature vectors (i.e., a feature matrix) from 3D ICH CT images. These vectors are input along with text into GPT-2, which is trained to generate text for consecutive CT images. In experiments, we evaluated the performance of four models to determine the most suitable image captioning model: (1) On the n-gram-based metrics, ResNet50V2 and DenseNet121 showed relatively high scores. (2) On the embedding-based metrics, DenseNet121 exhibited the best performance. (3) Overall, the models showed good performance in BERTScore. Our proposed method offers an automatic and valuable approach for analyzing 3D ICH CT images, contributing to the efficiency of ICH diagnosis and treatment.
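
To make the n-gram-based evaluation mentioned in the abstract more concrete, the following is a minimal sketch of clipped n-gram precision, the building block of BLEU-style scores. The abstract does not name the specific n-gram metrics used, so the choice of this metric is an assumption, and the function names and the two example sentences are hypothetical illustrations rather than data or code from the paper.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return all contiguous n-grams of a token list as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def clipped_ngram_precision(candidate, reference, n):
    """Fraction of candidate n-grams that also occur in the reference.

    Each candidate n-gram is credited at most as many times as it
    appears in the reference (the "clipping" used in BLEU-style scores).
    Tokenization here is plain lowercase whitespace splitting, which is
    a simplifying assumption.
    """
    cand_counts = Counter(ngrams(candidate.lower().split(), n))
    ref_counts = Counter(ngrams(reference.lower().split(), n))
    total = sum(cand_counts.values())
    if total == 0:
        return 0.0
    overlap = sum(min(count, ref_counts[gram])
                  for gram, count in cand_counts.items())
    return overlap / total


# Hypothetical reference and generated captions, for illustration only.
reference = "acute hemorrhage in the left basal ganglia"
candidate = "acute hemorrhage in the right basal ganglia"

p1 = clipped_ngram_precision(candidate, reference, 1)  # 6 of 7 unigrams match
p2 = clipped_ngram_precision(candidate, reference, 2)  # 4 of 6 bigrams match
print(f"unigram precision: {p1:.3f}, bigram precision: {p2:.3f}")
```

In this example a single wrong word ("right" instead of "left") lowers the unigram precision only slightly, even though it changes the clinical meaning. Surface n-gram overlap does not capture that kind of error, which is one reason to report embedding-based scores such as BERTScore alongside it.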

Published in Applied Sciences

ISSN: 2076-3417 (Online)
Publisher: MDPI AG
Country of publisher: Switzerland
LCC subjects: Technology: Engineering (General). Civil engineering (General); Science: Biology (General); Science: Physics; Science: Chemistry
Website: http://www.mdpi.com/journal/applsci
