Front Neurorobot. 2023 Jun 21;17:1206189.
doi: 10.3389/fnbot.2023.1206189. eCollection 2023.

Few-shot segmentation with duplex network and attention augmented module



Sifu Zeng et al. Front Neurorobot. 2023.

Abstract

Establishing the relationship between a limited number of samples and the objects to be segmented in diverse scenarios is the primary challenge in few-shot segmentation. However, many previous works overlook the crucial support-query interaction and the deeper information that remains to be exploited. This oversight can lead to model failure in complex scenarios, such as those with ambiguous boundaries. To solve this problem, we propose a duplex network built on a suppress-and-focus concept that suppresses the background and focuses on the foreground. Our network includes dynamic convolution to enhance the support-query interaction and a prototype match structure (PMS) to fully extract information from the support and query sets. The proposed model is called the dynamic prototype mixture convolutional network (DPMC). To minimize the impact of redundant information, we incorporate a hybrid attention module, the double-layer attention augmented convolutional module (DAAConv), into DPMC, which enables the network to concentrate more on foreground information. Our experiments on the PASCAL-5i and COCO-20i datasets suggest that DPMC and DAAConv outperform traditional prototype-based methods by up to 5-8% on average.
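
To make the prototype-matching idea concrete, the sketch below shows a generic masked-average-pooling prototype and a cosine-similarity prior map, a common building block in prototype-based few-shot segmentation. The function names, tensor shapes, and PyTorch usage are illustrative assumptions only and do not reproduce the authors' DPMC, PMS, or DAAConv implementations.

import torch
import torch.nn.functional as F

def masked_average_pooling(support_feat, support_mask):
    # Resize the binary support mask to the feature resolution and average the
    # support features over the foreground region to obtain a class prototype.
    mask = F.interpolate(support_mask, size=support_feat.shape[-2:],
                         mode="bilinear", align_corners=False)
    prototype = (support_feat * mask).sum(dim=(2, 3)) / (mask.sum(dim=(2, 3)) + 1e-6)
    return prototype  # shape (B, C)

def prototype_prior(query_feat, prototype):
    # Cosine similarity between every query location and the prototype,
    # giving a coarse foreground prior map for the query image.
    prototype = prototype[:, :, None, None]                  # (B, C, 1, 1)
    sim = F.cosine_similarity(query_feat, prototype, dim=1)  # (B, H, W)
    return sim.unsqueeze(1)                                  # (B, 1, H, W)

# Toy usage with hypothetical backbone features (shapes are assumptions).
support_feat = torch.randn(2, 256, 32, 32)                    # support feature map
support_mask = torch.randint(0, 2, (2, 1, 128, 128)).float()  # support annotation
query_feat = torch.randn(2, 256, 32, 32)                      # query feature map
prior = prototype_prior(query_feat, masked_average_pooling(support_feat, support_mask))

In the paper's setting, such a prior map would be only one ingredient: DPMC additionally uses dynamic convolution for the support-query interaction and DAAConv to emphasize foreground information.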

Keywords: attention module; duplex mode; few-shot segmentation; mixture models; semantic segmentation.


Conflict of interest statement

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Figures

Figure 1. Visualization of duplex networks and previous networks.
Figure 2. Overall structure of our method with the double-layer attention augmented convolutional module (DAAConv) and the dynamic prototype mixture convolutional network (DPMC).
Figure 3. Visual illustration of our proposed prototype match structure (PMS).
Figure 4. Segmentation results of DPCN, DPMC+, and DPMC. DPCN is the method of Liu et al. (2022), which does not use a duplex network. DPMC+ denotes the working path that uses only the foreground (i.e., the path containing the PMS). DPMC denotes our complete duplex network.
Figure 5. Segmentation results of our model and the baseline.

References

    1. Ao W., Zheng S., Meng Y. (2022). Few-shot semantic segmentation via mask aggregation. arXiv:2202.07231. doi: 10.48550/arXiv.2202.07231
    2. Bello I., Zoph B., Vaswani A., Shlens J., Le Q. V. (2019). "Attention augmented convolutional networks," in Proceedings of the IEEE/CVF International Conference on Computer Vision (Seoul: IEEE), 3286–3295. doi: 10.1109/ICCV.2019.00338
    3. Boudiaf M., Kervadec H., Masud Z. I., Piantanida P., Ben Ayed I., Dolz J. (2021). "Few-shot segmentation without meta-learning: a good transductive inference is all you need?," in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (Nashville, TN: IEEE), 13979–13988. doi: 10.1109/CVPR46437.2021.01376
    4. Chen C.-F. R., Fan Q., Panda R. (2021). "CrossViT: cross-attention multi-scale vision transformer for image classification," in Proceedings of the IEEE/CVF International Conference on Computer Vision (Montreal, QC: IEEE), 357–366. doi: 10.1109/ICCV48922.2021.00041
    5. Chen L.-C., Papandreou G., Kokkinos I., Murphy K., Yuille A. L. (2017a). DeepLab: semantic image segmentation with deep convolutional nets, atrous convolution, and fully connected CRFs. IEEE Trans. Pattern Anal. Mach. Intell. 40, 834–848. doi: 10.1109/TPAMI.2017.2699184
