Enhancing left ventricular segmentation in echocardiography with a modified mixed attention mechanism in SegFormer architecture

Hanqiong Wu; Gangrong Qu; Zhifeng Xiao; Fan Chunyu

Heliyon (Aug 2024)

Affiliations

Hanqiong Wu: Internal Medicine, The First Hospital of Jinzhou Medical University, Jinzhou, 121001, China
Gangrong Qu: Cardiovascular Medicine, Chongqing General Hospital of the Armed Police Force, Chongqing, 400061, China
Zhifeng Xiao: China Nanhu Academy of Electronics and Information Technology, Jiaxing, 314050, China
Fan Chunyu: Department of Cardiovascular Medicine, The People's Hospital of Liaoning Province, Shenyang, 110067, China; Corresponding author.

Journal volume & issue: Vol. 10, no. 15
p. e34845

Abstract

Echocardiography is a key tool for diagnosing cardiac diseases, and accurate left ventricular (LV) segmentation in echocardiographic videos is crucial for assessing cardiac function. The task is challenging because semantic segmentation of video must account for the temporal correlation between frames. This article introduces a method that incorporates a modified mixed attention mechanism into the SegFormer architecture, enabling it to capture the temporal correlation present in video data. For each step of the time series, the method feeds the input image into the encoder to obtain the current time feature map. This map, along with the historical time feature map, is then fed into a time-sensitive convolutional block attention module (TCBAM), a mixed attention mechanism. The TCBAM output serves two purposes: it becomes the historical time feature map for the subsequent step, and it acts as the combination of the current and historical time feature maps for the current step. The processed feature map is then passed to the Multilayer Perceptron (MLP) and subsequent networks to generate the final segmented image. In extensive experiments on three datasets, namely Hamad Medical Corporation, Tampere University, and Qatar University (HMC-QU), Cardiac Acquisitions for Multi-structure Ultrasound Segmentation (CAMUS), and Sunnybrook Cardiac Data (SCD), the method achieved a Dice coefficient of 97.92 % on the SCD dataset and an F1 score of 0.9263 on the CAMUS dataset, outperforming all compared models. This research offers a promising solution to the temporal modeling challenge in video semantic segmentation with transformer-based models and suggests a direction for future research in this field.
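
The per-frame data flow described in the abstract can be illustrated with a minimal sketch. It shows only the recurrence (encode the frame, fuse with the history, decode, carry the fused map forward) and is not the authors' implementation. The names `segment_video`, `encoder`, `tcbam`, and `decoder` are hypothetical placeholders, the toy functions stand in for real networks, and the handling of the first frame (reusing its own features as the history) is an assumption, since the abstract does not say how the history is initialized.

```python
from typing import Callable, Iterable, List, Optional, TypeVar

Frame = TypeVar("Frame")
Feature = TypeVar("Feature")
Mask = TypeVar("Mask")


def segment_video(
    frames: Iterable[Frame],
    encoder: Callable[[Frame], Feature],
    tcbam: Callable[[Feature, Feature], Feature],
    decoder: Callable[[Feature], Mask],
) -> List[Mask]:
    """Run a recurrent per-frame loop: encode, fuse with history, decode."""
    history: Optional[Feature] = None
    masks: List[Mask] = []
    for frame in frames:
        current = encoder(frame)  # current time feature map
        if history is None:
            # Assumption: the first frame has no history, so reuse its own features.
            history = current
        fused = tcbam(current, history)  # combine current and historical maps
        masks.append(decoder(fused))  # stands in for the MLP head and later layers
        history = fused  # fused map becomes the history for the next frame
    return masks


# Toy stand-ins (hypothetical): scalar "features" instead of real feature maps.
def toy_encoder(frame: float) -> float:
    return float(frame)


def toy_tcbam(current: float, history: float) -> float:
    # Plain averaging as a placeholder for the attention-based fusion.
    return 0.5 * current + 0.5 * history


def toy_decoder(feature: float) -> bool:
    return feature > 0.5


masks = segment_video([0.0, 1.0, 1.0, 0.0], toy_encoder, toy_tcbam, toy_decoder)
print(masks)  # [False, False, True, False]
```

Passing the three stages as callables keeps the sketch independent of any deep learning framework; in practice each would be a network module operating on tensors, and the fusion step would be the attention block rather than an average.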

Published in Heliyon

ISSN: 2405-8440 (Online)
Publisher: Elsevier
Country of publisher: United Kingdom
LCC subjects: Science: Science (General); Social Sciences: Social sciences (General)
Website: https://www.cell.com/heliyon/home
