Weakly Supervised Semantic Segmentation with Semantic Uncertainty Region Enhancement

doi:10.19678/j.issn.1000-3428.0260201

Abstract

Weakly supervised semantic segmentation typically uses Class Activation Maps (CAMs) to generate pseudo-labels for training a segmentation network. Because CAMs are derived from an image-level classification task, their responses tend to concentrate on the most salient object regions, which leaves foreground activation incomplete. CAM responses are also unstable near object boundaries and in regions with complex structure, which introduces pseudo-label noise and limits segmentation performance. To address these problems, this paper proposes a single-stage weakly supervised semantic segmentation method with semantic uncertainty region enhancement. First, a contrastive learning module based on semantic uncertainty regions is designed; it fuses multiple uncertainty cues to model the semantically uncertain regions of the CAM at a fine granularity, improving the completeness of foreground activation. Second, a dynamic adaptive Gaussian denoising module is introduced; it combines dynamic threshold adjustment with a Gaussian mixture denoising strategy to adaptively identify and progressively remove pseudo-label noise. Experimental results show that, using only image-level labels for supervision, the proposed method achieves mIoU of 72.2% and 72.8% on the PASCAL VOC 2012 validation and test sets, respectively, and 42.5% on the MS COCO 2014 dataset. Ablation experiments further show that introducing the semantic uncertainty region contrastive learning module or the dynamic adaptive Gaussian denoising module on its own improves mIoU by 1.6% and 2.5%, respectively, which verifies the effectiveness of the two modules in enhancing foreground completeness and suppressing pseudo-label noise.
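The results above are reported as mIoU (mean Intersection over Union). As a reading aid, the following is a minimal sketch of how this metric is commonly computed from a confusion matrix. It is not the authors' evaluation code: the function names `confusion_matrix` and `mean_iou`, the toy arrays, and the use of 255 as an ignore label (the PASCAL VOC convention) are assumptions made for illustration.

```python
import numpy as np


def confusion_matrix(pred, target, num_classes, ignore_index=255):
    """Count (ground truth, prediction) pairs; rows index the ground-truth class.

    Pixels whose ground-truth label equals ignore_index are skipped
    (255 is assumed here, following the PASCAL VOC convention).
    """
    pred = np.asarray(pred).ravel()
    target = np.asarray(target).ravel()
    keep = target != ignore_index
    # Encode each (target, pred) pair as a single flat index and count them.
    flat = target[keep].astype(np.int64) * num_classes + pred[keep].astype(np.int64)
    counts = np.bincount(flat, minlength=num_classes ** 2)
    return counts.reshape(num_classes, num_classes)


def mean_iou(conf):
    """Average per-class IoU = TP / (TP + FP + FN) over classes that occur."""
    tp = np.diag(conf).astype(np.float64)
    union = conf.sum(axis=0) + conf.sum(axis=1) - tp
    valid = union > 0  # skip classes absent from both prediction and ground truth
    return float((tp[valid] / union[valid]).mean())


# Hypothetical 2x3 label maps with three classes and one ignored pixel.
target = np.array([[0, 0, 1],
                   [1, 2, 255]])
pred = np.array([[0, 1, 1],
                 [1, 2, 2]])

miou = mean_iou(confusion_matrix(pred, target, num_classes=3))
print(f"mIoU = {miou:.4f}")
```

In a real evaluation the confusion matrix would be accumulated over every image in the split before `mean_iou` is applied once, rather than averaging per-image scores.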

Zhoufeng Liu, Huimin Li, Shumin Ding, Yanzhi Xu, Chunlei Li. Weakly Supervised Semantic Segmentation with Semantic Uncertainty Region Enhancement[J]. Computer Engineering, doi: 10.19678/j.issn.1000-3428.0260201.

