
Computer Engineering ›› 2025, Vol. 51 ›› Issue (10): 327-335. doi: 10.19678/j.issn.1000-3428.0069648

• Graphics and Image Processing •

RGB-D Saliency Object Detection Based on Sparse Contrastive Self-Distillation

YU Yangyang1, WU Dunquan1,*, CHEN Chenglizhao1,2

1. College of Computer Science and Technology, Qingdao University, Qingdao 266071, Shandong, China
    2. College of Computer Science and Technology, China University of Petroleum (East China), Qingdao 257061, Shandong, China
• Received: 2024-03-25 Revised: 2024-06-13 Online: 2025-10-15 Published: 2024-08-15
  • Contact: WU Dunquan

• Supported by: National Natural Science Foundation of China (62172246); Youth Innovation Team Development Program of Shandong Higher Education Institutions (2021KJ062)

Abstract:

In recent years, Red Green Blue-Depth (RGB-D) saliency object detection has made significant progress, with notable improvements in performance. However, its dependence on complex, resource-intensive architectures limits its application in resource-constrained environments. Although lightweight networks improve on model size and speed, they typically do so at the cost of performance. To address this challenge, an innovative lightweight solution is proposed that overcomes these limitations by streamlining network parameters while enhancing performance: an effective, general training strategy based on sparse contrastive self-distillation, which compresses and accelerates existing RGB-D saliency detection models while improving their accuracy. The strategy comprises two key techniques: sparse self-distillation and adversarial contrastive learning. Sparse self-distillation eliminates unnecessary parameters in the saliency detection model while retaining the key ones, yielding more efficient and effective saliency prediction. Adversarial contrastive learning, in turn, corrects potential errors, further refining the self-distillation process and improving overall model performance. Experimental results on the NJUD, NLPR, LFSD, ReDWeb-S, and COME15K benchmark datasets demonstrate that, compared with other State-of-The-Art (SOTA) methods, the proposed method produces more accurate saliency detection results. Furthermore, comparisons with existing SOTA lightweight RGB-D saliency detection models confirm that the proposed method achieves model-size reduction and performance enhancement simultaneously, without sacrificing accuracy.
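The sparse self-distillation idea described above can be illustrated with a minimal, hypothetical sketch (not the authors' implementation; all function names and weighting factors here are illustrative assumptions): a student objective combining a supervised saliency loss, a distillation term that pulls the compact student's predictions toward the full model's, and an L1 penalty that drives non-essential student parameters toward zero so they can be pruned.

```python
import math

def bce(pred, target):
    # Supervised term: binary cross-entropy between predicted and
    # ground-truth per-pixel saliency values.
    eps = 1e-7
    return -sum(t * math.log(p + eps) + (1 - t) * math.log(1 - p + eps)
                for p, t in zip(pred, target)) / len(pred)

def mse(a, b):
    # Distillation term: mean squared error pulling the student's
    # saliency map toward the (larger) teacher's.
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)

def l1(params):
    # Sparsity term: L1 penalty drives unnecessary parameters toward
    # zero, enabling pruning while key parameters are retained.
    return sum(abs(w) for w in params)

def sparse_self_distill_loss(student_pred, teacher_pred, gt,
                             student_params, alpha=0.5, lam=1e-4):
    """Hypothetical combined objective: supervision + distillation + sparsity."""
    return (bce(student_pred, gt)
            + alpha * mse(student_pred, teacher_pred)
            + lam * l1(student_params))
```

For example, `sparse_self_distill_loss([0.9, 0.1], [0.95, 0.05], [1.0, 0.0], [0.2, -0.1])` yields a small positive loss that shrinks as the student matches both the ground truth and the teacher and as its parameters become sparse. The abstract's adversarial contrastive component, which corrects errors in the distillation process, would add a further loss term not sketched here.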

Key words: Red Green Blue-Depth (RGB-D) saliency object detection, sparse self-distillation, contrastive learning
