Computer Engineering ›› 2024, Vol. 50 ›› Issue (10): 313-321. doi: 10.19678/j.issn.1000-3428.0068279

• Graphics and Image Processing •

Underwater Occlusion Target Detection Algorithm Based on Adversarial Attention Mechanism

LUO Cai*, LI Kaiyang, WU Jihua, REN Peng

  1. College of Oceanography and Space Informatics, China University of Petroleum (East China), Qingdao 266580, Shandong, China
  • Received: 2023-08-23 Published: 2024-10-15 Online: 2024-03-06
  • Contact: LUO Cai
  • Funding:
    National Key Research and Development Program of China (2021YFE0111600)

Abstract:

Underwater environments are complex, and occluded targets lose so much visual information that sufficient features are difficult to extract, so occluded underwater targets are easily missed. To address this problem, this study proposes an underwater target detection algorithm based on an adversarial attention mechanism. With Faster R-CNN as the framework, an Adversarial Occlusion sample Generation Network (AOGN) based on a spatial attention mechanism is designed. AOGN competes with the Faster R-CNN network: through a three-stage training process, it learns to generate samples that the detection network finds difficult to classify correctly, which improves the detection accuracy of Faster R-CNN for occluded underwater targets without adding any inference cost. Focal loss is used to increase the weight of hard samples in the loss, addressing the imbalance between easy and hard samples in underwater datasets. In addition, to obtain richer feature information about underwater targets, SE-ResNet50 replaces VGG16 as the backbone network; combining the residual network with the SE module strengthens feature extraction. Multiple ROIpooling branches are added to achieve multi-scale feature fusion and further enrich the features. Experimental results show that the algorithm achieves mean Average Precision (mAP) values of 73.76% and 86.85% on the URPC dataset and the underwater trash dataset, respectively, with miss rates for occluded targets of 2% and 7%, respectively, improving detection performance compared with other detection algorithms.

Key words: machine vision, underwater target detection, adversarial sample, loss function, SE-ResNet50 network, feature fusion
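
The abstract states that Focal loss increases the share of hard samples in the loss. The sketch below illustrates that effect with the standard binary focal loss, FL(p_t) = -alpha_t (1 - p_t)^gamma log(p_t). It is only an illustration, not the authors' implementation: the function names and the values gamma = 2.0 and alpha = 0.25 are assumptions, since the abstract does not give the hyperparameters used in the paper.

```python
import math


def cross_entropy(p, y):
    """Binary cross-entropy for one prediction.

    p: predicted probability of the positive class, with 0 < p < 1
    y: ground-truth label, 1 or 0
    """
    p_t = p if y == 1 else 1.0 - p
    return -math.log(p_t)


def focal_loss(p, y, gamma=2.0, alpha=0.25):
    """Binary focal loss for one prediction.

    gamma and alpha are assumed illustrative values, not taken from the paper.
    The factor (1 - p_t) ** gamma shrinks the loss of well-classified samples.
    """
    p_t = p if y == 1 else 1.0 - p
    alpha_t = alpha if y == 1 else 1.0 - alpha
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)


# An easy positive (confidently correct) and a hard positive (mostly wrong).
easy_p, hard_p = 0.95, 0.30

ce_ratio = cross_entropy(hard_p, 1) / cross_entropy(easy_p, 1)
fl_ratio = focal_loss(hard_p, 1) / focal_loss(easy_p, 1)

print(f"hard/easy loss ratio with cross-entropy: {ce_ratio:.1f}")
print(f"hard/easy loss ratio with focal loss:    {fl_ratio:.1f}")
```

The hard-to-easy loss ratio is much larger under focal loss than under plain cross-entropy, which is the sense in which hard samples, such as heavily occluded targets, gain weight during training. Setting gamma to 0 removes the modulating factor and leaves an alpha-weighted cross-entropy.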