IEEE Access (Jan 2024)

Application of Multimodal Feature Selection-Based Scene Recognition for Medical Education

  • Yiyi Zhang,
  • Li-Mei Liu,
  • Ying Zhu

DOI
https://doi.org/10.1109/ACCESS.2024.3409686
Journal volume & issue
Vol. 12
pp. 87934–87943

Abstract

Computed tomography (CT) imaging is widely applied to identify tumors and internal injuries in the body. CT image retargeting reduces the size of less important areas (such as normal organs) either horizontally or vertically, while preserving the size of key areas (such as affected organs) in the CT image. This retargeting process can significantly improve how CT images are displayed, which benefits downstream educational tasks, e.g., optimally displaying CT images to medical students. This paper focuses on effectively merging multiple channels of perceptual visual features to retarget CT images with intricate spatial layouts. The key objective is to develop an active learning model that identifies where human attention is directed in a scene. To capture both semantically and visually important elements in each CT image, we use the BING objectness descriptor, which quickly and accurately localizes objects or their parts across multiple scales. Each object-aware patch is then described by multi-channel low-level features. To characterize how humans perceive important scenic elements, we propose a locality-preserved and interactive active learning (LIAL) approach, which sequentially generates a gaze shift path (GSP) for each CT image. Because such GSP features are high-dimensional, a distribution-preserved feature selection (DPFS) is designed to retain a few highly discriminative GSP features. Finally, these refined GSP features are combined by a Gaussian mixture model (GMM) to retarget CT images. Extensive empirical studies demonstrate the effectiveness of the designed method.

Keywords