Hepatic vessel segmentation based on 3D swin-transformer with inductive biased multi-head self-attention

Mian Wu; Yinling Qian; Xiangyun Liao; Qiong Wang; Pheng-Ann Heng

doi:10.1186/s12880-023-01045-y

BMC Medical Imaging (Jul 2023)

Affiliations

Mian Wu: Guangdong Provincial Key Laboratory of Computer Vision and Virtual Reality Technology, Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences
Yinling Qian: Guangdong Provincial Key Laboratory of Computer Vision and Virtual Reality Technology, Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences
Xiangyun Liao: Guangdong Provincial Key Laboratory of Computer Vision and Virtual Reality Technology, Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences
Qiong Wang: Guangdong Provincial Key Laboratory of Computer Vision and Virtual Reality Technology, Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences
Pheng-Ann Heng: Guangdong Provincial Key Laboratory of Computer Vision and Virtual Reality Technology, Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences

DOI: https://doi.org/10.1186/s12880-023-01045-y
Journal volume & issue: Vol. 23, no. 1
pp. 1 – 14

Abstract

Purpose: Segmentation of liver vessels from CT images is indispensable prior to surgical planning and has attracted broad interest in the medical image analysis community. Because of the complex vessel structure and the low-contrast background, automatic liver vessel segmentation remains particularly challenging. Most related studies adopt FCN, U-Net, and V-Net variants as a backbone. However, these methods mainly focus on capturing multi-scale local features, which may produce misclassified voxels because of the convolutional operator's limited local receptive field.

Methods: We propose a robust end-to-end vessel segmentation network called Inductive BIased Multi-Head Attention Vessel Net (IBIMHAV-Net), which extends the Swin Transformer to 3D and employs an effective combination of convolution and self-attention. In practice, we introduce voxel-wise embedding rather than patch-wise embedding to locate precise liver vessel voxels, and adopt multi-scale convolutional operators to gain local spatial information. In addition, we propose inductive biased multi-head self-attention, which learns an inductively biased relative positional embedding from an initialized absolute position embedding. This yields more reliable query and key matrices.

Results: We conducted experiments on the 3DIRCADb dataset. The average Dice and sensitivity of the four tested cases were 74.8% and 77.5%, which exceed the results of existing deep learning methods and an improved graph cuts method. The Branches Detected (BD) and Tree-length Detected (TD) indexes also show that the model captures global and local features better than the other methods.

Conclusion: The proposed model, IBIMHAV-Net, provides automatic, accurate 3D liver vessel segmentation with an interleaved architecture that better utilizes both global and local spatial features in CT volumes. It can be further extended to other clinical data.
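For readers unfamiliar with the two headline metrics, the following is a minimal sketch of the standard Dice and sensitivity definitions on binary masks. It is not the authors' evaluation code: the function name `dice_and_sensitivity`, the flattened-list input format, and the convention of scoring empty masks as 1.0 are all assumptions made for illustration.

```python
from typing import Sequence, Tuple


def dice_and_sensitivity(pred: Sequence[int], truth: Sequence[int]) -> Tuple[float, float]:
    """Return (Dice, sensitivity) for two flattened binary masks of equal length.

    Hypothetical helper for illustration; 1 marks a vessel voxel, 0 background.
    """
    if len(pred) != len(truth):
        raise ValueError("masks must contain the same number of voxels")

    # Count true positives, false positives, and false negatives voxel by voxel.
    tp = sum(1 for p, t in zip(pred, truth) if p and t)
    fp = sum(1 for p, t in zip(pred, truth) if p and not t)
    fn = sum(1 for p, t in zip(pred, truth) if t and not p)

    # Dice = 2TP / (2TP + FP + FN); sensitivity = TP / (TP + FN).
    # Assumed convention: when a denominator is zero, score 1.0.
    dice = 2 * tp / (2 * tp + fp + fn) if (tp + fp + fn) else 1.0
    sensitivity = tp / (tp + fn) if (tp + fn) else 1.0
    return dice, sensitivity


# Toy example: a six-voxel "volume" flattened to a list.
pred = [1, 1, 0, 0, 1, 0]
truth = [1, 0, 0, 1, 1, 0]
dice, sens = dice_and_sensitivity(pred, truth)
print(f"Dice = {dice:.3f}, sensitivity = {sens:.3f}")
```

In practice a 3D CT mask would be flattened (or handled with an array library) before being passed to a function like this, and the per-case scores would then be averaged across test cases.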

Published in BMC Medical Imaging

ISSN: 1471-2342 (Online)
Publisher: BMC
Country of publisher: United Kingdom
LCC subjects: Medicine: Medicine (General): Medical technology
Website: http://bmcmedimaging.biomedcentral.com
