A New Generative Model for Textual Descriptions of Medical Images Using Transformers Enhanced with Convolutional Neural Networks

Artur Gomes Barreto; Juliana Martins de Oliveira; Francisco Nauber Bernardo Gois; Paulo Cesar Cortez; Victor Hugo Costa de Albuquerque

doi:10.3390/bioengineering10091098

Bioengineering (Sep 2023)

Affiliations

Artur Gomes Barreto: Graduate Program in Electrical Engineering, Federal University of Ceará, Fortaleza 60455-760, Brazil
Juliana Martins de Oliveira: Graduate Program in Teleinformatics Engineering, Federal University of Ceará, Fortaleza 60455-970, Brazil
Francisco Nauber Bernardo Gois: Controladoria e Ouvidoria Geral do Estado, Governo do Estado do Ceará, Fortaleza 60822-325, Brazil
Paulo Cesar Cortez: Department of Teleinformatics Engineering, Federal University of Ceará, Fortaleza 60455-970, Brazil
Victor Hugo Costa de Albuquerque: Department of Teleinformatics Engineering, Federal University of Ceará, Fortaleza 60455-970, Brazil

DOI: https://doi.org/10.3390/bioengineering10091098
Journal volume & issue: Vol. 10, no. 9, p. 1098

Abstract

The automatic generation of descriptions for medical images has sparked increasing interest in the healthcare field due to its potential to assist professionals in the interpretation and analysis of clinical exams. This study explores the development and evaluation of a generalist generative model for medical images. Gaps were identified in the literature, such as the lack of studies that explore the performance of specific models for medical description generation and the need for objective evaluation of the quality of generated descriptions. Additionally, models lack generalization across different image modalities and medical conditions. To address these issues, a methodological strategy was adopted that combines natural language processing with feature extraction from medical images and feeds them into a generative model based on neural networks. The goal was to achieve model generalization across various image modalities and medical conditions. The results were promising, with an accuracy of 0.7628 and a BLEU-1 score of 0.5387. However, the quality of the generated descriptions may still be limited, exhibiting semantic errors or lacking relevant details. These limitations could be attributed to the availability and representativeness of the data, as well as the techniques used.

Published in Bioengineering

ISSN: 2306-5354 (Online)
Publisher: MDPI AG
Country of publisher: Switzerland
LCC subjects: Technology; Science: Biology (General)
Website: https://www.mdpi.com/journal/bioengineering
