IEEE Access (Jan 2024)

ASDeM: Augmenting SAM With Decoupled Memory for Video Object Segmentation

  • Xiaohu Liu,
  • Yichuang Luo,
  • Wei Sun

DOI
https://doi.org/10.1109/ACCESS.2024.3404463
Journal volume & issue
Vol. 12
pp. 73218 – 73227

Abstract

Video object segmentation (VOS) models have achieved impressive performance but offer limited interactivity with prompts such as clicks, boxes, or text. Some models have been combined with the Segment Anything Model (SAM) in a naive manner to add this ability, but they achieve limited performance owing to coarse masks and inconsistent segmentation propagation. In this paper, we propose ASDeM, which augments SAM with decoupled memory and achieves high performance on tracking and segmentation in videos. Specifically, to explore the combination of SAM and a VOS model, ASDeM uses the class-agnostic features from SAM to build the memory features for the VOS model, and then applies object-agnostic temporal propagation with decoupled memory to address the feature staleness problem and the oblivion of visual information. Given a prompt for a specific object in a video, users can obtain satisfactory segmentation and tracking results. Experiments on public benchmarks demonstrate the effectiveness of ASDeM.

Keywords