Mixed-UNet: Refined class activation mapping for weakly-supervised semantic segmentation with multi-scale inference

Yang Liu; Lijin Lian; Ersi Zhang; Lulu Xu; Chufan Xiao; Xiaoyun Zhong; Fang Li; Bin Jiang; Yuhan Dong; Lan Ma; Qiming Huang; Ming Xu; Yongbing Zhang; Dongmei Yu; Chenggang Yan; Peiwu Qin

doi:10.3389/fcomp.2022.1036934

Frontiers in Computer Science (Nov 2022)

Affiliations

Yang Liu: Institute of Biopharmaceutical and Health Engineering, Tsinghua Shenzhen International Graduate School, Shenzhen, China
Yang Liu: School of Mechanical, Electrical and Information Engineering, Shandong University, Weihai, China
Lijin Lian: Institute of Biopharmaceutical and Health Engineering, Tsinghua Shenzhen International Graduate School, Shenzhen, China
Lijin Lian: Institute of Tsinghua-Berkeley Shenzhen, Tsinghua Shenzhen International Graduate School, Shenzhen, China
Ersi Zhang: School of Artificial Intelligence, Hangzhou Dianzi University, Hangzhou, China
Lulu Xu: Institute of Biopharmaceutical and Health Engineering, Tsinghua Shenzhen International Graduate School, Shenzhen, China
Lulu Xu: Institute of Tsinghua-Berkeley Shenzhen, Tsinghua Shenzhen International Graduate School, Shenzhen, China
Chufan Xiao: Institute of Biopharmaceutical and Health Engineering, Tsinghua Shenzhen International Graduate School, Shenzhen, China
Chufan Xiao: Institute of Tsinghua-Berkeley Shenzhen, Tsinghua Shenzhen International Graduate School, Shenzhen, China
Xiaoyun Zhong: Institute of Biopharmaceutical and Health Engineering, Tsinghua Shenzhen International Graduate School, Shenzhen, China
Xiaoyun Zhong: Institute of Tsinghua-Berkeley Shenzhen, Tsinghua Shenzhen International Graduate School, Shenzhen, China
Fang Li: Institute of Biopharmaceutical and Health Engineering, Tsinghua Shenzhen International Graduate School, Shenzhen, China
Fang Li: Institute of Tsinghua-Berkeley Shenzhen, Tsinghua Shenzhen International Graduate School, Shenzhen, China
Bin Jiang: School of Mechanical, Electrical and Information Engineering, Shandong University, Weihai, China
Yuhan Dong: School of Mechanical, Electrical and Information Engineering, Shandong University, Weihai, China
Lan Ma: Institute of Biopharmaceutical and Health Engineering, Tsinghua Shenzhen International Graduate School, Shenzhen, China
Lan Ma: Institute of Tsinghua-Berkeley Shenzhen, Tsinghua Shenzhen International Graduate School, Shenzhen, China
Qiming Huang: Institute of Biopharmaceutical and Health Engineering, Tsinghua Shenzhen International Graduate School, Shenzhen, China
Ming Xu: Department of Automation, Tsinghua University, Beijing, China
Yongbing Zhang: School of Computer Science and Technology, Shenzhen Graduate School of Harbin Institute of Technology, Shenzhen, China
Dongmei Yu: School of Mechanical, Electrical and Information Engineering, Shandong University, Weihai, China
Chenggang Yan: School of Artificial Intelligence, Hangzhou Dianzi University, Hangzhou, China
Peiwu Qin: Institute of Biopharmaceutical and Health Engineering, Tsinghua Shenzhen International Graduate School, Shenzhen, China
Peiwu Qin: Institute of Tsinghua-Berkeley Shenzhen, Tsinghua Shenzhen International Graduate School, Shenzhen, China

Journal volume & issue: Vol. 4

Abstract

Deep learning techniques have shown great potential in medical image processing, particularly through accurate and reliable segmentation of magnetic resonance imaging (MRI) and computed tomography (CT) scans, which allows lesions to be localized and diagnosed. However, training these segmentation models requires a large number of manually annotated pixel-level labels, which are time-consuming and labor-intensive to produce, whereas image-level labels are easier to obtain. It is imperative to resolve this problem with weakly-supervised semantic segmentation models that use image-level labels as supervision, since they can significantly reduce human annotation effort. Most advanced solutions exploit class activation mapping (CAM). However, the original CAMs rarely capture the precise boundaries of lesions. In this study, we propose a multi-scale inference strategy that refines CAMs by reducing the detail loss of single-scale reasoning. For segmentation, we develop a novel model named Mixed-UNet, which has two parallel branches in the decoding phase. The final results are obtained by fusing the features extracted from the two branches. We evaluate Mixed-UNet against several prevalent deep learning-based segmentation approaches on a dataset collected from a local hospital and on public datasets. The validation results demonstrate that our model surpasses available methods under the same supervision level in segmenting various lesions from brain imaging.
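
The abstract does not say how the multi-scale CAMs are combined, so the following is only a minimal, dependency-free sketch of one common way to do it: compute a CAM at several input scales, resize each map to a common resolution, normalize it, and average the results. The averaging rule, the nearest-neighbour resizing, the threshold value, and all function names (resize_nearest, min_max_normalize, fuse_multiscale_cams, threshold_mask) are assumptions made for illustration and are not taken from the paper.

```python
def resize_nearest(grid, out_h, out_w):
    """Nearest-neighbour resize of a 2-D list of floats (illustrative helper)."""
    in_h, in_w = len(grid), len(grid[0])
    return [
        [grid[r * in_h // out_h][c * in_w // out_w] for c in range(out_w)]
        for r in range(out_h)
    ]


def min_max_normalize(grid):
    """Rescale a map to [0, 1]; a constant map becomes all zeros."""
    flat = [v for row in grid for v in row]
    lo, hi = min(flat), max(flat)
    span = hi - lo
    if span == 0:
        return [[0.0 for _ in row] for row in grid]
    return [[(v - lo) / span for v in row] for row in grid]


def fuse_multiscale_cams(cams, out_h, out_w):
    """Average CAMs from different scales (assumed fusion rule, not the paper's)."""
    resized = [min_max_normalize(resize_nearest(c, out_h, out_w)) for c in cams]
    n = len(resized)
    return [
        [sum(m[r][c] for m in resized) / n for c in range(out_w)]
        for r in range(out_h)
    ]


def threshold_mask(cam, threshold):
    """Turn a fused CAM into a binary pseudo-mask (threshold is hypothetical)."""
    return [[1 if v >= threshold else 0 for v in row] for row in cam]


# Toy CAMs standing in for the outputs of a classifier at two input scales.
coarse_cam = [[0.0, 1.0],
              [0.0, 1.0]]
fine_cam = [[0.0, 0.0, 0.0, 2.0],
            [0.0, 0.0, 2.0, 2.0],
            [0.0, 0.0, 2.0, 2.0],
            [0.0, 0.0, 0.0, 2.0]]

fused = fuse_multiscale_cams([coarse_cam, fine_cam], out_h=4, out_w=4)
mask = threshold_mask(fused, threshold=0.6)
```

In this toy case the coarse map alone would mark the whole right half as lesion, while the fused map keeps the finer boundary from the higher-resolution map. That is the intuition behind reducing the detail loss of single-scale reasoning.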

Published in Frontiers in Computer Science

ISSN: 2624-9898 (Online)
Publisher: Frontiers Media S.A.
Country of publisher: Switzerland
LCC subjects: Science: Mathematics: Instruments and machines: Electronic computers. Computer science
Website: https://www.frontiersin.org/journals/computer-science#
