Multi-Scale Underwater Small Object Detection Based on Multi-Rate Dilated Convolution

doi:10.19678/j.issn.1000-3428.0064754

Abstract: Underwater imaging conditions are complex, and small objects have low resolution and carry little information, so effective features are difficult to extract; as a result, small underwater objects suffer from a low recognition rate and a high false alarm rate. To address this problem, this paper proposes a multi-scale underwater small object detection method based on multi-rate dilated convolution. First, the DarkNet53 backbone is used for feature extraction to obtain high-level semantic information, and a multi-rate dilated convolution module expands the receptive field of the network, capturing feature information over a larger pixel range by adjusting the dilation rates. A residual structure is added to preserve the detailed information needed to localize small objects. Next, to restore the resolution of small objects, a deconvolution module reconstructs image details, and detailed features are learned from feature maps at different resolutions. Finally, a Feature Pyramid Network (FPN) structure introduces richer multi-scale context information into the deconvolution layers, so that features at multiple levels are learned across scales to improve the localization and classification of small objects. The output of each layer after feature fusion is then integrated and screened to obtain the final prediction results. Experimental results show that the method achieves mAP values of 82.6% and 81.5% on the public Pascal VOC2007 and URPC2018 datasets, respectively, at detection speeds of 34.4 and 34.2 frames/s. The method thus effectively improves the detection of small underwater objects while maintaining real-time performance.

Key words: deep learning, underwater small object detection, dilated convolution, deconvolution network, residual network
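
The abstract's two size-related mechanisms, dilated convolution widening the receptive field and deconvolution restoring resolution, can be illustrated with standard arithmetic. The sketch below is not the authors' implementation: the function names are hypothetical, and the 3x3 kernel, the dilation rates (1, 2, 5), and the 13-pixel feature map are assumptions chosen only for illustration, since the abstract does not state the actual values.

```python
from typing import Sequence


def effective_kernel(kernel_size: int, dilation: int) -> int:
    """Spatial extent spanned by one dilated kernel: k + (k - 1)(d - 1)."""
    return kernel_size + (kernel_size - 1) * (dilation - 1)


def stacked_receptive_field(kernel_size: int, dilations: Sequence[int]) -> int:
    """Receptive field of stride-1 dilated convolutions applied in sequence."""
    rf = 1
    for d in dilations:
        # Each stride-1 layer widens the field by (effective kernel - 1).
        rf += effective_kernel(kernel_size, d) - 1
    return rf


def deconv_output_size(size: int, kernel_size: int, stride: int,
                       padding: int = 0, output_padding: int = 0) -> int:
    """Output length of a transposed convolution with dilation 1."""
    return (size - 1) * stride - 2 * padding + kernel_size + output_padding


if __name__ == "__main__":
    rates = [1, 2, 5]  # assumed rates, not taken from the paper
    for r in rates:
        print(f"rate {r}: effective kernel {effective_kernel(3, r)}")
    print("stacked receptive field:", stacked_receptive_field(3, rates))
    # Assumed 13-pixel map upsampled by a 4x4, stride-2, padding-1 deconvolution.
    print("deconv output size:", deconv_output_size(13, 4, 2, padding=1))
```

With these assumed values, a 3x3 kernel spans 3, 5, and 11 pixels at rates 1, 2, and 5 without adding any weights, which is why adjusting the dilation rate enlarges the pixel range a feature can see. The transposed convolution doubles the feature-map side length, the kind of resolution recovery the deconvolution module relies on.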

CLC Number: TP18

CHEN Yuzhang, HUANG Yizi, ZHANG Junhan. Multi-Scale Underwater Small Object Detection Based on Multi-Rate Dilated Convolution[J]. Computer Engineering, 2023, 49(6): 257-264.

谌雨章, 黄逸姿, 张钧涵. 基于多速率空洞卷积的多尺度水下小目标检测[J]. 计算机工程, 2023, 49(6): 257-264.

URL: http://www.ecice06.com/EN/10.19678/j.issn.1000-3428.0064754

http://www.ecice06.com/EN/Y2023/V49/I6/257
