PeerJ Comput Sci. 2025 May 19;11:e2623.
doi: 10.7717/peerj-cs.2623. eCollection 2025.

SODU2-NET: a novel deep learning-based approach for salient object detection utilizing U-NET


Hyder Abbas et al. PeerJ Comput Sci.

Abstract

Detecting and segmenting salient objects from natural scenes, often referred to as salient object detection, has attracted great interest in computer vision. Addressing the challenge posed by complex backgrounds is crucial for advancing the field. This article proposes a novel deep learning-based architecture, SODU2-NET (salient object detection U2-Net), for salient object detection that builds on the U-NET base structure. The model addresses a gap in previous work on complex backgrounds by employing a densely supervised encoder-decoder network that combines background subtraction techniques with deep learning components capable of discerning relevant foreground information. First, an enriched encoder block applies full feature fusion (FFF) with atrous spatial pyramid pooling (ASPP) at varying dilation rates to efficiently capture multi-scale contextual information, improving salient object detection in complex backgrounds and reducing information loss during down-sampling. Second, an attention module refines the decoder by selectively focusing on relevant features, allowing the model to reconstruct detailed and contextually relevant information that is essential for accurately determining salient objects. Finally, the architecture is further improved by a residual block at the encoder end, which is responsible for both saliency prediction and map refinement. The proposed network learns the transformation between input images and ground truth, enabling accurate segmentation of salient object regions with clear borders and precise prediction of fine structures.
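The three building blocks named above (ASPP with varying dilation rates, multiplicative attention on decoder features, and a residual connection) can be sketched in miniature with NumPy. This is an illustrative toy on single-channel arrays, not the paper's implementation: the function names, the fusion-by-summation in `aspp`, and the sigmoid gating in `attention_gate` are all assumptions made for the sketch.

```python
import numpy as np

def dilated_conv2d(x, kernel, dilation=1):
    """Single-channel 2-D convolution with an atrous (dilation) rate.

    Zero-pads so the output matches the input size, mirroring how ASPP
    branches keep a common spatial resolution across dilation rates.
    """
    kh, kw = kernel.shape
    # The effective receptive field grows with the dilation rate.
    eff_h = (kh - 1) * dilation + 1
    eff_w = (kw - 1) * dilation + 1
    ph, pw = eff_h // 2, eff_w // 2
    xp = np.pad(x, ((ph, ph), (pw, pw)))
    out = np.zeros_like(x, dtype=float)
    for i in range(kh):
        for j in range(kw):
            di, dj = i * dilation, j * dilation
            out += kernel[i, j] * xp[di:di + x.shape[0], dj:dj + x.shape[1]]
    return out

def aspp(x, kernel, rates=(1, 2, 4)):
    """ASPP sketch: parallel dilated convolutions fused by summation,
    so one feature map mixes several receptive-field sizes."""
    return sum(dilated_conv2d(x, kernel, r) for r in rates)

def attention_gate(features, gate):
    """Multiplicative attention: sigmoid(gate) re-weights the features,
    suppressing background activations before decoding."""
    return features * (1.0 / (1.0 + np.exp(-gate)))

def residual_block(x, f):
    """Residual connection: the block learns a correction f(x) on top
    of the identity, easing refinement of the saliency map."""
    return f(x) + x
```

With larger dilation rates the same 3x3 kernel covers a wider context at no extra parameter cost, which is the reason ASPP helps separate salient objects from cluttered backgrounds.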
SODU2-NET demonstrates superior performance on five public datasets (DUTS, SOD, DUT-OMRON, HKU-IS, and PASCAL-S) and on a new real-world dataset, the Changsha dataset. In a comparative assessment against FCN, SqueezeNet, DeepLab, and Mask R-CNN, the proposed SODU2-NET achieves improvements in precision (6%), recall (5%), and accuracy (3%). Overall, the approach shows promise for improving the accuracy and efficiency of salient object detection in a variety of settings.
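The reported metrics follow the standard pixel-wise definitions for binary saliency maps. A minimal sketch of computing them, assuming a thresholded prediction compared against a ground-truth mask (the function name is invented for illustration; this is not code from the paper):

```python
import numpy as np

def saliency_scores(pred, gt):
    """Pixel-wise precision, recall, accuracy, and mean absolute error
    for binary saliency maps (arrays of 0/1 values)."""
    pred = pred.astype(bool)
    gt = gt.astype(bool)
    tp = np.sum(pred & gt)    # salient pixels correctly detected
    fp = np.sum(pred & ~gt)   # background wrongly marked salient
    fn = np.sum(~pred & gt)   # salient pixels missed
    tn = np.sum(~pred & ~gt)  # background correctly rejected
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    accuracy = (tp + tn) / pred.size
    mae = np.mean(np.abs(pred.astype(float) - gt.astype(float)))
    return precision, recall, accuracy, mae
```

Precision penalizes background leakage into the predicted mask, recall penalizes missed object pixels, and MAE (used in the paper's evaluation figures) averages the per-pixel disagreement.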

Keywords: ASPP module; Attention mechanism; Deep learning; Salient object detection; U-Net.


Conflict of interest statement

The authors declare that they have no competing interests.

Figures

Figure 1. The proposed SODU2-NET model for salient object detection.
Figure 2. SEM block.
Figure 3. The standard architecture diagram of U-Net.
Figure 4. The standard architecture diagram of residual blocks.
Figure 5. The standard architecture diagram of ASPP.
Figure 6. Sample images of the DUTS dataset.
Figure 7. Sample images of the SOD dataset.
Figure 8. Sample images of the DUT-OMRON dataset.
Figure 9. Sample images of the HKU-IS dataset.
Figure 10. Sample images of the PASCAL-S dataset.
Figure 11. Sample images of the dataset captured in Changsha, Hunan, China.
Figure 12. Saliency map.
Figure 13. Confusion matrix.
Figure 14. Accuracy.
Figure 15. Precision.
Figure 16. F1 curve.
Figure 17. Mean absolute error.
Figure 18. Accuracy comparison in salient object detection.

