
Computer Engineering ›› 2023, Vol. 49 ›› Issue (6): 257-264. doi: 10.19678/j.issn.1000-3428.0064754

• Development Research and Engineering Application •

Multi-Scale Underwater Small Object Detection Based on Multi-Rate Dilated Convolution

CHEN Yuzhang1, HUANG Yizi1, ZHANG Junhan2   

  1. School of Computer Science and Information Engineering, Hubei University, Wuhan 430062, China;
  2. Manchester Metropolitan Joint Institute, Hubei University, Wuhan 430062, China
  • Received: 2022-05-19 Revised: 2022-07-08 Published: 2022-09-02

基于多速率空洞卷积的多尺度水下小目标检测

谌雨章1, 黄逸姿1, 张钧涵2   

  1. 湖北大学 计算机与信息工程学院, 武汉 430062;
  2. 湖北大学 曼城联合学院, 武汉 430062
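  • About the authors: CHEN Yuzhang (born 1984), male, associate professor, Ph.D.; main research interests: photoelectric detection and image processing. HUANG Yizi (corresponding author) and ZHANG Junhan, undergraduate students.
  • Funding:
    Ministry of Education Industry-University Cooperative Education Project (202101142041); College Students' Innovation and Entrepreneurship Training Program (S201910512024, S202010512080, 202010512020, X202110512069).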

Abstract: Underwater imaging conditions are complex, and small objects have low resolution and carry little information, so effective features are difficult to extract; as a result, small underwater objects are detected with a low recognition rate and a high false alarm rate. To address this problem, this paper proposes a multi-scale underwater small object detection method based on multi-rate dilated convolution. First, the DarkNet53 backbone is used for feature extraction to obtain high-level semantic information, and a multi-rate dilated convolution module expands the receptive field of the network, capturing feature information over a larger pixel range by adjusting the dilation rates. A residual structure is added to preserve the detailed information needed to localize small objects. Next, to restore the resolution of small objects, a deconvolution module reconstructs image details, and detailed features are learned from feature maps at different resolutions. Finally, a Feature Pyramid Network (FPN) introduces richer multi-scale context information into the deconvolution layers, so that features at multiple levels are learned across scales to improve the localization and classification of small objects. Feature integration and screening are then performed on the output of each layer after feature fusion to obtain the final prediction results. Experimental results show that the method achieves mAP values of 82.6% and 81.5% on the public Pascal VOC2007 and URPC2018 datasets, respectively, at detection speeds of 34.4 and 34.2 frames/s, respectively. The method thus effectively improves the detection of small underwater objects while maintaining real-time performance.
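
The idea of enlarging the receptive field by adjusting dilation rates, combined with a residual path, can be illustrated with a minimal 1-D sketch. This is not the paper's implementation: the rates (1, 2, 3), the averaging used to fuse the branches, and all function names are assumptions chosen for illustration only.

```python
# Illustrative sketch only: a 1-D multi-rate dilated convolution with a
# residual connection. The rates, kernel weights, and fusion by averaging
# are assumptions, not values or design choices taken from the paper.

def effective_kernel_size(kernel_size: int, rate: int) -> int:
    """Span covered by a dilated kernel: k + (k - 1) * (rate - 1)."""
    return kernel_size + (kernel_size - 1) * (rate - 1)


def dilated_conv1d(signal, kernel, rate):
    """Zero-padded, same-length 1-D dilated convolution (correlation form)."""
    half = (len(kernel) - 1) // 2  # assumes an odd kernel length
    out = []
    for i in range(len(signal)):
        acc = 0.0
        for j, w in enumerate(kernel):
            idx = i + (j - half) * rate   # taps are spaced `rate` apart
            if 0 <= idx < len(signal):    # samples outside count as zero
                acc += w * signal[idx]
        out.append(acc)
    return out


def multi_rate_block(signal, kernel, rates=(1, 2, 3)):
    """Average parallel dilated branches, then add the input back (residual)."""
    branches = [dilated_conv1d(signal, kernel, r) for r in rates]
    fused = [sum(vals) / len(rates) for vals in zip(*branches)]
    return [x + f for x, f in zip(signal, fused)]
```

With a kernel of size 3, rates 1, 2, and 3 cover spans of 3, 5, and 7 samples with the same number of weights, which is how a larger pixel range is reached without extra parameters. The residual sum keeps the unfiltered input available to later layers.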

Key words: deep learning, underwater small object detection, dilated convolution, deconvolution network, residual network

摘要: 水下场景成像条件复杂、小目标的分辨率低且信息量少而难以提取有效的特征信息,导致水下小目标检测识别率低并且虚警率高。针对该问题,提出一种基于多速率空洞卷积的多尺度水下小目标检测方法。使用主干网络模型DarkNet53进行特征提取得到高层语义信息,采用多速率空洞卷积模块扩大网络的感受野,通过调整扩张率在更大像素范围内获取特征信息,并添加残差结构保证小目标定位的详细信息。为恢复小目标的分辨率,利用反卷积模块对图像细节进行重建,在不同分辨率的特征图上学习细节特征。在此基础上,通过特征金字塔结构将更丰富的多尺度上下文信息引入反卷积层,使多个层次的特征跨尺度学习以增强小目标的定位和分类,并对特征融合后的每一层输出进行特征整合和筛选,得到最终的预测结果。实验结果表明,该方法在Pascal VOC2007和URPC2018公共数据集上分别取得了82.6%和81.5%的mAP,在检测速度上分别达到34.4 和34.2 帧/s,能够在保证实时检测的基础上有效增强水下小目标的检测能力。

关键词: 深度学习, 水下小目标检测, 空洞卷积, 反卷积网络, 残差网络
