Improved YOLO Object Detection Algorithm Based on Deformable Convolution

doi:10.19678/j.issn.1000-3428.0059096

Abstract

The YOLO object detection algorithm suffers from inaccurate bounding-box localization and low detection accuracy on small objects. To address these problems, an improved YOLO algorithm based on deformable convolution, dcn-YOLO, is proposed. The algorithm first uses K-means++ to cluster anchor boxes that better match the object sizes in the data set, which reduces the influence of the initial points on the clustering result and speeds up the convergence of network training. A residual deformable convolution module, res-dcn, is then constructed. Two dcn-YOLO variants are derived, one by embedding res-dcn in the first YOLO feature extraction head module and the other by replacing three YOLO feature extraction head modules with res-dcn, so that the network can adaptively learn the receptive field of each feature point and extract more effective features for objects of different sizes and shapes, thereby improving detection accuracy. Experimental results on the VOC data sets show that the proposed algorithm effectively improves object detection accuracy. Its mAP reaches 82.6%, which is 2.1, 5.2, and 9.4 percentage points higher than that of YOLO, SSD, and Faster R-CNN, respectively.
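The anchor-clustering step can be illustrated with a minimal sketch. The abstract does not state the distance measure or the number of anchors, so the 1 - IoU distance (common practice for YOLO anchors) and the helper names `iou_wh`, `kmeanspp_init`, and `cluster_anchors` are assumptions made here for illustration, not the authors' implementation.

```python
import random

def iou_wh(box, centroid):
    """IoU of two boxes given as (width, height), aligned at a common corner."""
    inter = min(box[0], centroid[0]) * min(box[1], centroid[1])
    union = box[0] * box[1] + centroid[0] * centroid[1] - inter
    return inter / union

def kmeanspp_init(boxes, k, rng):
    """K-means++ seeding: later seeds are drawn with probability
    proportional to the squared distance to the nearest chosen seed."""
    centroids = [rng.choice(boxes)]
    while len(centroids) < k:
        # Assumed distance: 1 - IoU (not specified in the abstract).
        d2 = [min(1.0 - iou_wh(b, c) for c in centroids) ** 2 for b in boxes]
        if sum(d2) == 0:
            # Every box coincides with a seed; fall back to a uniform pick.
            centroids.append(rng.choice(boxes))
        else:
            centroids.append(rng.choices(boxes, weights=d2, k=1)[0])
    return centroids

def cluster_anchors(boxes, k, iters=100, seed=0):
    """Cluster (width, height) pairs into k anchors, sorted by area."""
    rng = random.Random(seed)
    centroids = kmeanspp_init(boxes, k, rng)
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for b in boxes:
            # Assign each box to the centroid with the highest IoU.
            best = max(range(k), key=lambda i: iou_wh(b, centroids[i]))
            groups[best].append(b)
        # Move each centroid to the mean of its group; keep it if the group is empty.
        updated = [
            (sum(w for w, _ in g) / len(g), sum(h for _, h in g) / len(g)) if g else c
            for g, c in zip(groups, centroids)
        ]
        if updated == centroids:
            break
        centroids = updated
    return sorted(centroids, key=lambda c: c[0] * c[1])
```

Calling `cluster_anchors(boxes, k)` on the ground-truth box sizes of a training set returns `k` anchor shapes ordered from smallest to largest area. Because the seeds are spread out by the distance-weighted sampling, the result depends less on the first randomly chosen point than plain K-means does.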

Key words: YOLO algorithm, object detection, receptive field, deformable convolution, k-means++ algorithm
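The way deformable convolution lets a feature point adapt its receptive field can be sketched for a single output value. This is only an illustration of the general sampling rule of deformable convolution (each kernel tap is shifted by a learned offset and read with bilinear interpolation); the abstract does not describe the internal layout of res-dcn, so the residual connection and the offset-predicting branch are not shown, and the function names `bilinear` and `deform_conv_at` are hypothetical.

```python
import math

def bilinear(feat, y, x):
    """Sample a 2D feature map at a fractional (y, x); outside the map counts as zero."""
    h, w = len(feat), len(feat[0])
    y0, x0 = math.floor(y), math.floor(x)
    value = 0.0
    for yy in (y0, y0 + 1):
        for xx in (x0, x0 + 1):
            if 0 <= yy < h and 0 <= xx < w:
                weight = (1 - abs(y - yy)) * (1 - abs(x - xx))
                value += weight * feat[yy][xx]
    return value

def deform_conv_at(feat, kernel, offsets, cy, cx):
    """One output value of a 3x3 deformable convolution centred at (cy, cx).

    offsets[i][j] is the (dy, dx) shift applied to kernel tap (i, j).
    In a trained network these shifts are predicted from the input;
    here they are passed in directly for illustration.
    """
    total = 0.0
    for i in range(3):
        for j in range(3):
            dy, dx = offsets[i][j]
            total += kernel[i][j] * bilinear(feat, cy + i - 1 + dy, cx + j - 1 + dx)
    return total
```

With all offsets set to `(0.0, 0.0)` the function reduces to an ordinary 3x3 convolution on the regular grid; non-zero offsets move the sampling points, which is how the effective receptive field can follow objects of different sizes and shapes.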

CLC Number: TP391.41

HUANG Fengqi, CHEN Ming, FENG Guofu. Improved YOLO Object Detection Algorithm Based on Deformable Convolution[J]. Computer Engineering, 2021, 47(10): 269-275,282.

黄凤琪, 陈明, 冯国富. 基于可变形卷积的改进YOLO目标检测算法[J]. 计算机工程, 2021, 47(10): 269-275,282.

URL: http://www.ecice06.com/EN/10.19678/j.issn.1000-3428.0059096

http://www.ecice06.com/EN/Y2021/V47/I10/269
