IEEE Access (Jan 2024)

Application of Multimodal Feature Selection-Based Scene Recognition for Medical Education

  • Yiyi Zhang,
  • Li-Mei Liu,
  • Ying Zhu

DOI
https://doi.org/10.1109/ACCESS.2024.3409686
Journal volume & issue
Vol. 12
pp. 87934–87943

Abstract

Computed tomography (CT) imaging is widely applied to identify tumors and internal injuries in the body. CT image retargeting reduces the size of less important areas (such as normal organs) either horizontally or vertically, while preserving the size of key areas (such as affected organs) in the CT image. This retargeting process can significantly improve how CT images are displayed, which benefits downstream educational tasks, e.g., optimally displaying CT images to medical students. This paper focuses on effectively merging multiple channels of perceptual visual features to retarget CT images with intricate spatial layouts. The key objective is to develop an active learning model that identifies where human attention is directed in a scene. To capture both semantically and visually important elements in each CT image, we use the BING objectness descriptor, which quickly and accurately localizes objects or their parts across multiple scales. Each object-aware patch is then described by multi-channel low-level features. To characterize how humans perceive important scenic elements, we propose a locality-preserved and interactive active learning (LIAL) approach, which sequentially generates a gaze shift path (GSP) for each CT image. Because such GSP features are high-dimensional, a distribution-preserved feature selection (DPFS) is designed to retain a few highly discriminative GSP features. Finally, these refined GSP features are combined by a Gaussian mixture model (GMM) to retarget CT images. Extensive empirical studies demonstrate the effectiveness of the designed method.

Keywords