Computer Engineering (计算机工程)

Weakly Supervised Semantic Segmentation with Semantic Uncertainty Region Enhancement

  • Published:2026-06-18

Abstract: Weakly supervised semantic segmentation commonly uses Class Activation Maps (CAMs) to generate pseudo-labels for training segmentation networks. However, because CAMs originate from image-level classification, their responses tend to concentrate on the most salient object regions, which leaves foreground activation incomplete. CAM responses are also unstable near object boundaries and in structurally complex regions, which introduces pseudo-label noise and limits segmentation performance. To address these problems, this paper proposes a single-stage weakly supervised semantic segmentation method with semantic uncertainty region enhancement. First, a contrastive learning module based on semantic uncertainty regions fuses multiple uncertainty cues to model the semantically uncertain regions of CAMs at a fine granularity, improving the completeness of foreground activation. Second, a dynamic adaptive Gaussian denoising module combines dynamic threshold adjustment with Gaussian mixture denoising to adaptively identify and progressively remove pseudo-label noise. Using only image-level labels as supervision, the proposed method achieves mIoU of 72.2% and 72.8% on the PASCAL VOC 2012 validation and test sets, respectively, and 42.5% on the MS COCO 2014 dataset. Ablation experiments further show that introducing the semantic uncertainty region contrastive learning module and the dynamic adaptive Gaussian denoising module individually improves mIoU by 1.6% and 2.5%, respectively, verifying the effectiveness of the two modules in enhancing foreground completeness and suppressing pseudo-label noise.
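The two ideas named in the abstract, marking uncertain CAM regions and separating noisy pseudo-labels with a Gaussian mixture, can be pictured with a minimal NumPy sketch. This is an illustration under stated assumptions, not the paper's implementation: the fixed thresholds `low` and `high`, the ignore index 255, the helper names, and the choice to fit a two-component mixture to per-pixel loss values are all hypothetical, since the abstract does not specify these details (the paper's thresholds are described as dynamic).

```python
import numpy as np

IGNORE = 255  # assumed "ignore" index for uncertain pixels


def cam_to_pseudo_label(cam, low=0.25, high=0.7):
    """Turn CAMs of shape (C, H, W), scaled to [0, 1], into a label map.

    Confident foreground gets class ids 1..C, confident background gets 0,
    and everything in between is left as an uncertain (ignored) region.
    """
    score = cam.max(axis=0)         # strongest class response per pixel
    label = cam.argmax(axis=0) + 1  # foreground ids start at 1
    out = np.full(score.shape, IGNORE, dtype=np.int64)
    out[score >= high] = label[score >= high]
    out[score <= low] = 0
    return out


def fit_gmm_1d(x, iters=50):
    """Fit a two-component 1-D Gaussian mixture with plain EM."""
    x = np.asarray(x, dtype=np.float64)
    mu = np.array([x.min(), x.max()])   # spread the initial means apart
    sigma = np.full(2, x.std() + 1e-6)
    pi = np.array([0.5, 0.5])
    for _ in range(iters):
        # E-step: responsibilities, computed in log space for stability
        log_p = (-0.5 * ((x[:, None] - mu) / sigma) ** 2
                 - np.log(sigma) + np.log(pi))
        log_p -= log_p.max(axis=1, keepdims=True)
        resp = np.exp(log_p)
        resp /= resp.sum(axis=1, keepdims=True)
        # M-step: re-estimate means, standard deviations, and weights
        nk = resp.sum(axis=0) + 1e-12
        mu = (resp * x[:, None]).sum(axis=0) / nk
        sigma = np.sqrt((resp * (x[:, None] - mu) ** 2).sum(axis=0) / nk) + 1e-6
        pi = nk / nk.sum()
    return mu, sigma, pi, resp


def clean_probability(losses):
    """Posterior of the low-mean component, read as 'label is probably clean'."""
    mu, _, _, resp = fit_gmm_1d(losses)
    return resp[:, np.argmin(mu)]
```

In a sketch like this, pixels with a low clean probability would be dropped or down-weighted in the segmentation loss, while the pixels marked `IGNORE` are the candidates a module focused on uncertain regions would refine.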