No More Training: SAM’s Zero-Shot Transfer Capabilities for Cost-Efficient Medical Image Segmentation

Juan D. Gutierrez; Roberto Rodriguez-Echeverria; Emilio Delgado; Miguel Angel Suero Rodrigo; Fernando Sanchez-Figueroa

doi:10.1109/ACCESS.2024.3353142

IEEE Access (Jan 2024)

Affiliations

Juan D. Gutierrez: ORCiD; Department of Electronics and Computer Science, Universidad de Santiago de Compostela, Lugo, Spain
Roberto Rodriguez-Echeverria: ORCiD; Department of Computer Systems Engineering and Telematics, i3 Laboratory @ Quercus Research Group, Universidad de Extremadura, Cáceres, Spain
Emilio Delgado: ORCiD; Department of Computer Systems Engineering and Telematics, i3 Laboratory @ Quercus Research Group, Universidad de Extremadura, Cáceres, Spain
Miguel Angel Suero Rodrigo: ORCiD; Servicio Extremeño de Salud, Hospital Universitario de Cáceres, Cáceres, Spain
Fernando Sanchez-Figueroa: ORCiD; Department of Computer Systems Engineering and Telematics, i3 Laboratory @ Quercus Research Group, Universidad de Extremadura, Cáceres, Spain

DOI: https://doi.org/10.1109/ACCESS.2024.3353142
Journal volume & issue: Vol. 12, pp. 24205–24216

Abstract

Semantic segmentation of medical images has enormous potential for diagnosis and surgery. However, achieving precise results involves designing and training complex Deep Learning (DL) models specifically for this task, an option available to only a few. The Segment Anything Model (SAM), developed by Meta, can segment objects in virtually any type of image. This paper showcases SAM’s robustness and exceptional performance in medical image segmentation, even in the absence of direct training on these image types (lung Computed Tomography (CT) scans and chest X-rays, in particular). Additionally, it achieves this outcome while requiring minimal user intervention. Although the dataset used to train SAM does not contain a single sample of either medical image type, processing a popular dataset comprising 20 volumes with a total of 3520 slices using the ViT-L version of the model yields an average Jaccard index of 91.45% and an average Dice score of 94.95%. The same version of the model achieves a 93.19% Dice score and an 87.45% Jaccard index when segmenting a frequently used chest X-ray dataset. The values obtained are above the 70% mark recommended in the literature, and close to those of state-of-the-art models developed specifically for medical segmentation. These results are achieved without user interaction by providing the model with positive prompts based on the masks of the dataset used and a negative prompt located in the center of the bounding box that contains the masks.
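
The two metrics reported in the abstract, and the bounding-box center used for the negative prompt, can be illustrated with a minimal sketch. This is not the authors' code: the function names (`overlap_counts`, `jaccard_index`, `dice_score`, `bounding_box_center`), the toy masks, and the convention of returning 1.0 for two empty masks are assumptions made for illustration. Masks are plain nested lists of 0/1 values so the example needs only the standard library; in practice they would typically be array types.

```python
from itertools import chain


def overlap_counts(pred, truth):
    """Return (intersection, predicted area, ground-truth area) in pixels."""
    p = [bool(v) for v in chain.from_iterable(pred)]
    t = [bool(v) for v in chain.from_iterable(truth)]
    if len(p) != len(t):
        raise ValueError("masks must have the same number of pixels")
    inter = sum(a and b for a, b in zip(p, t))
    return inter, sum(p), sum(t)


def jaccard_index(pred, truth):
    """Jaccard index (intersection over union) of two binary masks."""
    inter, p_area, t_area = overlap_counts(pred, truth)
    union = p_area + t_area - inter
    # Assumed convention: two empty masks count as a perfect match.
    return 1.0 if union == 0 else inter / union


def dice_score(pred, truth):
    """Dice score: twice the intersection over the sum of both areas."""
    inter, p_area, t_area = overlap_counts(pred, truth)
    total = p_area + t_area
    # Assumed convention: two empty masks count as a perfect match.
    return 1.0 if total == 0 else 2 * inter / total


def bounding_box_center(mask):
    """(row, col) center of the tightest box around all foreground pixels."""
    rows = [r for r, row in enumerate(mask) if any(row)]
    cols = [c for row in mask for c, v in enumerate(row) if v]
    if not rows:
        raise ValueError("mask has no foreground pixels")
    return (min(rows) + max(rows)) // 2, (min(cols) + max(cols)) // 2


# Hypothetical toy masks: two small "lung" regions separated by background.
truth = [
    [0, 0, 0, 0, 0, 0],
    [0, 1, 0, 0, 1, 0],
    [0, 1, 0, 0, 1, 0],
    [0, 0, 0, 0, 0, 0],
]
pred = [
    [0, 0, 0, 0, 0, 0],
    [0, 1, 0, 0, 1, 0],
    [0, 1, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0],
]

print(jaccard_index(pred, truth))   # 3 shared pixels / 4 in the union
print(dice_score(pred, truth))      # 2 * 3 / (3 + 4)
print(bounding_box_center(truth))   # lands between the two regions
```

In this toy case the bounding-box center falls on a background pixel between the two foreground regions, which is the kind of location where a negative prompt would be placed. How the positive prompts are derived from the masks is not specified in the abstract, so that step is left out of the sketch.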

Published in IEEE Access

ISSN: 2169-3536 (Online)
Publisher: IEEE
Country of publisher: United States
LCC subjects: Technology: Electrical engineering. Electronics. Nuclear engineering
Website: https://ieeexplore.ieee.org/xpl/RecentIssue.jsp?punumber=6287639
