Alexandria Engineering Journal (Jan 2025)
SF-SAM-Adapter: SAM-based segmentation model integrates prior knowledge for gaze image reflection noise removal
Abstract
Gaze tracking technology in HMDs (Head-Mounted Displays) suffers from decreased accuracy due to highlight reflection noise from users' glasses. To address this, we present a denoising method that first pinpoints the noisy regions with an advanced segmentation model and then fills the flawed regions with an advanced image inpainting algorithm. In the segmentation stage, we introduce a novel model built on the recently proposed large segmentation model SAM (Segment Anything Model), called SF-SAM-Adapter (Spatial and Frequency aware SAM Adapter). It injects prior knowledge of reflection noise, namely its strip-like shape in the spatial domain and its high-frequency concentration in the frequency domain, into SAM by integrating specially designed trainable adapter modules into the original structure, retaining the expressive power of the large model while better adapting it to the downstream task. We achieved segmentation metrics of IoU (Intersection over Union) = 0.749 and Dice = 0.853 with a memory size of 13.9 MB, outperforming recent techniques including UNet, UNet++, BATFormer, FANet, MSA, and SAM2-Adapter. In the inpainting stage, we employ the advanced inpainting algorithm LAMA (Large Mask inpainting), which yields significant improvements in gaze tracking accuracy of 0.502°, 0.182°, and 0.319° across three algorithms. The code and datasets used in the current study are available in the repository: https://github.com/leiting5297/SF-SAM-Adapter.git.
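The abstract does not specify how the adapter modules encode the two priors. The following is a minimal illustrative sketch, not the authors' released implementation: it assumes a residual adapter (here named SpatialFrequencyAdapter) attached to a frozen SAM encoder block, with a strip-shaped convolution branch for the spatial prior and an FFT high-pass branch for the frequency prior; the kernel sizes, bottleneck width, and cutoff ratio are assumptions chosen for illustration.

```python
# Hypothetical sketch of a spatial + frequency prior adapter (not the paper's code).
import torch
import torch.nn as nn
import torch.fft


class SpatialFrequencyAdapter(nn.Module):
    """Illustrative adapter: a strip-shaped conv branch (spatial prior for
    elongated reflections) plus an FFT high-pass branch (frequency prior),
    fused and added residually to frozen backbone features."""

    def __init__(self, dim: int, bottleneck: int = 32, cutoff_ratio: float = 0.25):
        super().__init__()
        # Spatial branch: 1xk and kx1 convolutions approximate strip-like receptive fields.
        self.strip_conv = nn.Sequential(
            nn.Conv2d(dim, bottleneck, kernel_size=(1, 7), padding=(0, 3)),
            nn.GELU(),
            nn.Conv2d(bottleneck, bottleneck, kernel_size=(7, 1), padding=(3, 0)),
        )
        # Frequency branch: project features after suppressing low-frequency content.
        self.freq_proj = nn.Conv2d(dim, bottleneck, kernel_size=1)
        self.cutoff_ratio = cutoff_ratio
        # Fuse both branches back to the backbone channel dimension.
        self.fuse = nn.Conv2d(2 * bottleneck, dim, kernel_size=1)

    def high_pass(self, x: torch.Tensor) -> torch.Tensor:
        # Zero out a central low-frequency square in the shifted 2-D spectrum.
        freq = torch.fft.fftshift(torch.fft.fft2(x, norm="ortho"), dim=(-2, -1))
        h, w = x.shape[-2:]
        ch, cw = int(h * self.cutoff_ratio), int(w * self.cutoff_ratio)
        mask = torch.ones_like(freq.real)
        mask[..., h // 2 - ch:h // 2 + ch, w // 2 - cw:w // 2 + cw] = 0.0
        freq = freq * mask
        return torch.fft.ifft2(torch.fft.ifftshift(freq, dim=(-2, -1)), norm="ortho").real

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (B, C, H, W) feature map from a frozen SAM image-encoder block.
        spatial = self.strip_conv(x)
        frequency = self.freq_proj(self.high_pass(x))
        return x + self.fuse(torch.cat([spatial, frequency], dim=1))


if __name__ == "__main__":
    feats = torch.randn(1, 256, 64, 64)         # dummy backbone features
    adapter = SpatialFrequencyAdapter(dim=256)  # only the adapter is trainable
    print(adapter(feats).shape)                 # torch.Size([1, 256, 64, 64])
```

In this kind of design, only the small adapter parameters are updated during fine-tuning while the SAM backbone stays frozen, which is consistent with the abstract's claim of preserving the large model's expressive power at a small trainable memory footprint.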