ViT-FRD: A Vision Transformer Model for Cardiac MRI Image Segmentation Based on Feature Recombination Distillation

Chunyu Fan; Qi Su; Zhifeng Xiao; Hao Su; Aijie Hou; Bo Luan

doi:10.1109/ACCESS.2023.3302522

IEEE Access (Jan 2023)

Affiliations

Chunyu Fan: Department of Cardiovascular Medicine, The People's Hospital of Liaoning Province, Shenyang, China
Qi Su: Department of Cardiovascular Medicine, The People's Hospital of Liaoning Province, Shenyang, China
Zhifeng Xiao: School of Engineering, Penn State Erie, The Behrend College, Erie, PA, USA
Hao Su: Department of Cardiovascular Medicine, The People's Hospital of Liaoning Province, Shenyang, China
Aijie Hou: Department of Cardiovascular Medicine, The People's Hospital of Liaoning Province, Shenyang, China
Bo Luan: Department of Cardiovascular Medicine, The People's Hospital of Liaoning Province, Shenyang, China

DOI: https://doi.org/10.1109/ACCESS.2023.3302522
Journal volume & issue: Vol. 11
pp. 129763 – 129772

Abstract

Cardiac magnetic resonance imaging (MRI) analysis is a useful tool for screening patients for heart disease, and early, accurate diagnosis of heart disease is key to effective treatment. MRI provides important evidence for diagnosing cardiac diseases. The rise of deep learning has transformed computer-aided diagnostic systems, especially in medical imaging. Existing MRI-based cardiac structure segmentation models rely mainly on convolutional neural networks (CNNs); this lack of model diversity limits prediction performance. This paper introduces Visual Transformer with Feature Recombination and Feature Distillation (ViT-FRD), a novel learning pipeline that combines a visual transformer (ViT) and a CNN through knowledge refinement. The training procedure allows the student model (the ViT) to learn from the teacher model (the CNN) by optimizing distillation losses. In addition, ViT-FRD provides two performance boosters that increase the efficacy and efficiency of training. The proposed method is validated on two cardiac MRI image datasets. The results show that ViT-FRD achieves state-of-the-art (SOTA) performance and outperforms the widely used baseline model.
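
The abstract does not give the form of the distillation losses, so the following is only a minimal sketch of a generic teacher-student loss, assuming the common soft-label formulation (hard-label cross-entropy plus a temperature-scaled KL term) applied to the class logits of a single pixel. The function names and the `temperature` and `alpha` parameters are hypothetical, and the feature recombination step of ViT-FRD is not modeled here.

```python
import math


def log_softmax(logits, temperature=1.0):
    """Numerically stable log-softmax of a list of logits at a given temperature."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)
    log_norm = m + math.log(sum(math.exp(z - m) for z in scaled))
    return [z - log_norm for z in scaled]


def distillation_loss(student_logits, teacher_logits, label,
                      temperature=2.0, alpha=0.5):
    """Generic soft-label distillation loss for one pixel (assumed form,
    not the loss defined in the paper).

    student_logits: class scores from the student (e.g. the ViT)
    teacher_logits: class scores from the teacher (e.g. the CNN)
    label:          index of the ground-truth class
    alpha:          weight of the distillation term, in [0, 1]
    """
    # Hard-label term: cross-entropy of the student against the ground truth.
    ce = -log_softmax(student_logits)[label]

    # Soft-label term: KL(teacher || student) on temperature-softened outputs.
    log_t = log_softmax(teacher_logits, temperature)
    log_s = log_softmax(student_logits, temperature)
    kl = sum(math.exp(lt) * (lt - ls) for lt, ls in zip(log_t, log_s))

    # The T^2 factor keeps the soft-term gradients on a comparable scale.
    return (1.0 - alpha) * ce + alpha * temperature ** 2 * kl
```

With `alpha=1.0` the loss is zero exactly when the student reproduces the teacher's logits, and it grows as the two softened distributions diverge. For segmentation, such a loss would typically be averaged over all pixels.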

Published in IEEE Access

ISSN: 2169-3536 (Online)
Publisher: IEEE
Country of publisher: United States
LCC subjects: Technology: Electrical engineering. Electronics. Nuclear engineering
Website: https://ieeexplore.ieee.org/xpl/RecentIssue.jsp?punumber=6287639
