SODU2-NET: a novel deep learning-based approach for salient object detection utilizing U-NET

Hyder Abbas; Shen Bing Ren; Muhammad Asim; Syeda Iqra Hassan; Ahmed A. Abd El-Latif

doi:10.7717/peerj-cs.2623

PeerJ Computer Science (May 2025)


Affiliations

Hyder Abbas: State Key Laboratory of Public Big Data, College of Computer Science and Technology, Institute for Artificial Intelligence, Guizhou University, Guiyang, Guizhou, China
Shen Bing Ren: School of Computer Science and Engineering, Central South University, Changsha, China
Muhammad Asim: EIAS Data Science and Blockchain Laboratory, College of Computer and Information Sciences, Prince Sultan University, Riyadh, Saudi Arabia
Syeda Iqra Hassan: Department of Electrical and Electronic Engineering, British Malaysian Institute, Universiti Kuala Lumpur, Kuala Lumpur, Malaysia
Ahmed A. Abd El-Latif: EIAS Data Science and Blockchain Laboratory, College of Computer and Information Sciences, Prince Sultan University, Riyadh, Saudi Arabia

DOI: https://doi.org/10.7717/peerj-cs.2623
Journal volume & issue: Vol. 11, p. e2623

Abstract


Detecting and segmenting salient objects in natural scenes, often referred to as salient object detection, has attracted great interest in computer vision. Addressing the challenge posed by complex backgrounds is crucial for advancing the field. This article proposes a novel deep learning-based architecture for salient object detection, called SODU2-NET (Salient Object Detection U2-Net), that builds on the U-NET base structure. This model addresses a gap in previous work that focused primarily on complex backgrounds by employing a densely supervised encoder-decoder network. The proposed SODU2-NET employs sophisticated background subtraction techniques and advanced deep learning architectures that can discern relevant foreground information when dealing with complex backgrounds. Firstly, an enriched encoder block combines full feature fusion (FFF) with atrous spatial pyramid pooling (ASPP) at varying dilation rates to efficiently capture multi-scale contextual information, improving salient object detection in complex backgrounds and reducing the loss of information during down-sampling. Secondly, the block includes an attention module that refines the decoder and enhances the detection of salient objects in complex backgrounds by selectively focusing attention on relevant features. This allows the model to reconstruct detailed and contextually relevant information, which is essential for determining salient objects accurately. Finally, the architecture is improved by adding a residual block at the encoder end, which is responsible for both saliency prediction and map refinement. The proposed network is designed to learn the transformation between input images and ground truth, enabling accurate segmentation of salient object regions with clear borders and accurate prediction of fine structures.
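The role of the varying dilation rates in ASPP can be illustrated with a minimal sketch. This is not the authors' implementation: it is plain Python without a deep learning framework, the function names (`dilated_conv2d`, `receptive_field`, `aspp`) and the rates `(1, 2, 4)` are hypothetical, and the branches are fused by simple averaging, which is an assumption made only for illustration.

```python
def dilated_conv2d(image, kernel, rate):
    """Atrous (dilated) 2D cross-correlation with zero padding.

    The output has the same size as the input. `kernel` is assumed to be
    square with an odd side length; `rate` is the spacing between taps.
    """
    h, w = len(image), len(image[0])
    k = len(kernel)
    half = k // 2
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            acc = 0.0
            for i in range(k):
                for j in range(k):
                    # Kernel taps are spread `rate` pixels apart.
                    yy = y + (i - half) * rate
                    xx = x + (j - half) * rate
                    if 0 <= yy < h and 0 <= xx < w:
                        acc += kernel[i][j] * image[yy][xx]
            out[y][x] = acc
    return out


def receptive_field(kernel_size, rate):
    """Extent covered by a dilated kernel along one axis."""
    return kernel_size + (kernel_size - 1) * (rate - 1)


def aspp(image, kernel, rates=(1, 2, 4)):
    """Run parallel dilated branches and fuse them.

    Averaging is an illustrative assumption, not the fusion used in
    SODU2-NET.
    """
    branches = [dilated_conv2d(image, kernel, r) for r in rates]
    h, w = len(image), len(image[0])
    return [
        [sum(b[y][x] for b in branches) / len(branches) for x in range(w)]
        for y in range(h)
    ]


# A 9x9 image with a single bright pixel in the centre.
image = [[0.0] * 9 for _ in range(9)]
image[4][4] = 1.0
ones = [[1.0] * 3 for _ in range(3)]
fused = aspp(image, ones)
```

With a 3x3 kernel, rates 1, 2 and 4 cover 3, 5 and 9 pixels per axis using the same nine weights, which is how parallel branches gather context at several scales without further down-sampling.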
SODU2-NET is shown to achieve superior performance on five public datasets (DUTS, SOD, DUT-OMRON, HKU-IS, and PASCAL-S) and on a new real-world dataset, the Changsha dataset. In a comparative assessment against FCN, SqueezeNet, DeepLab, and Mask R-CNN, the proposed SODU2-NET achieves improvements in precision (6%), recall (5%), and accuracy (3%). Overall, the approach shows promise for improving the accuracy and efficiency of salient object detection in a variety of settings.
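For readers unfamiliar with the reported metrics, the sketch below shows how precision, recall and accuracy are commonly computed for a binary saliency mask. The abstract does not state the evaluation protocol, so the pixel-wise comparison against an already binarized prediction is an assumption, and the function names are hypothetical.

```python
def confusion_counts(pred, truth):
    """Count pixel-wise TP, FP, FN, TN for two binary masks of equal size."""
    tp = fp = fn = tn = 0
    for p_row, t_row in zip(pred, truth):
        for p, t in zip(p_row, t_row):
            if p and t:
                tp += 1
            elif p and not t:
                fp += 1
            elif not p and t:
                fn += 1
            else:
                tn += 1
    return tp, fp, fn, tn


def precision_recall_accuracy(pred, truth):
    """Return (precision, recall, accuracy); empty denominators give 0.0."""
    tp, fp, fn, tn = confusion_counts(pred, truth)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    return precision, recall, accuracy


# Toy example: 1 marks a salient pixel, 0 marks background.
pred = [[1, 1, 0],
        [0, 1, 0]]
truth = [[1, 0, 0],
         [0, 1, 1]]
precision, recall, accuracy = precision_recall_accuracy(pred, truth)
```

In this toy example two of the three predicted salient pixels are correct and two of the three true salient pixels are found, so precision and recall are both 2/3, and four of the six pixels are labelled correctly.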

Published in PeerJ Computer Science

ISSN: 2376-5992 (Online)
Publisher: PeerJ Inc.
Country of publisher: United States
LCC subjects: Science: Mathematics: Instruments and machines: Electronic computers. Computer science
Website: https://peerj.com/computer-science/
