Infrared Target Detection Method Based on Improved YOLOv3

Computer Engineering ›› 2022, Vol. 48 ›› Issue (3): 211-219. doi: 10.19678/j.issn.1000-3428.0060518

QIN Peng^1,2,3, TANG Chuanming^1,2,3, LIU Yunfeng^1,2, ZHANG Jianlin^1,2, XU Zhiyong^1,2

1. Key Laboratory of Beam Control, Chinese Academy of Sciences, Chengdu 610209, China;
2. Institute of Optics and Electronics, Chinese Academy of Sciences, Chengdu 610209, China;
3. School of Electronic, Electrical and Communication Engineering, University of Chinese Academy of Sciences, Beijing 100049, China

Received: 2021-01-07 Revised: 2021-03-09 Published: 2021-03-24
About the authors: QIN Peng (born 1996), male, master's student, main research interests are deep learning and infrared target detection; TANG Chuanming, master's student; LIU Yunfeng (corresponding author), associate research fellow, Ph.D.; ZHANG Jianlin, research fellow, Ph.D.; XU Zhiyong, research fellow.
Funding:
Innovation Project of the National Science and Technology Commission (G158207).

Abstract

To address the low recognition rate of targets such as pedestrians and vehicles in infrared scenes and the interference of complex backgrounds, an infrared target detection method based on the Effi-YOLOv3 model is proposed. The method combines the lightweight and efficient EfficientNet backbone network with the YOLOv3 network to improve the running speed of the model. By simulating the receptive field mechanism of human vision, an improved Receptive Field Block (RFB) is introduced, which greatly enlarges the effective receptive field of the network with almost no increase in computation. DBD and CBD structures are then built on deformable convolution and dynamic activation functions to make feature encoding more flexible and to increase model capacity. Finally, CIoU, which accounts for the center-point distance, the overlap ratio, and the aspect-ratio deviation between the predicted box and the ground-truth box, is selected as the loss function; it better reflects the degree of overlap between the two boxes and speeds up bounding-box regression. Experimental results show that the method reaches a mean Average Precision (mAP) of 70.8% on the FLIR dataset, that the Effi-YOLOv3 model has only 33.3% of the parameters of the YOLOv3 model, and that it detects targets in infrared scenes more effectively.

Key words: YOLOv3 model, infrared target detection, complex background, deformable convolution, dynamic activation function
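
The abstract describes CIoU as a loss that combines three terms: overlap ratio, center-point distance, and aspect-ratio deviation. The following is a minimal sketch of that idea using the commonly published CIoU definition, loss = 1 - IoU + rho^2/c^2 + alpha*v. It is not the authors' implementation. The function name `ciou_loss`, the corner format `(x1, y1, x2, y2)`, and the `eps` stabilizer are assumptions made for illustration.

```python
import math


def ciou_loss(pred, gt, eps=1e-9):
    """CIoU loss for two axis-aligned boxes given as (x1, y1, x2, y2).

    Hypothetical helper for illustration; assumes x2 > x1 and y2 > y1.
    """
    px1, py1, px2, py2 = pred
    gx1, gy1, gx2, gy2 = gt
    pw, ph = px2 - px1, py2 - py1
    gw, gh = gx2 - gx1, gy2 - gy1

    # Overlap term: intersection over union.
    iw = max(0.0, min(px2, gx2) - max(px1, gx1))
    ih = max(0.0, min(py2, gy2) - max(py1, gy1))
    inter = iw * ih
    union = pw * ph + gw * gh - inter
    iou = inter / (union + eps)

    # Distance term: squared distance between the two box centers,
    # normalized by the squared diagonal of the smallest enclosing box.
    rho2 = (((px1 + px2) - (gx1 + gx2)) ** 2
            + ((py1 + py2) - (gy1 + gy2)) ** 2) / 4.0
    cw = max(px2, gx2) - min(px1, gx1)
    ch = max(py2, gy2) - min(py1, gy1)
    c2 = cw ** 2 + ch ** 2 + eps

    # Aspect-ratio term: v measures the width/height mismatch,
    # alpha weights it more heavily as the boxes overlap more.
    v = (4.0 / math.pi ** 2) * (
        math.atan(gw / (gh + eps)) - math.atan(pw / (ph + eps))
    ) ** 2
    alpha = v / (1.0 - iou + v + eps)

    return 1.0 - iou + rho2 / c2 + alpha * v


if __name__ == "__main__":
    # A perfect prediction gives a loss near 0; disjoint boxes give a loss above 1.
    print(ciou_loss((0, 0, 2, 2), (0, 0, 2, 2)))
    print(ciou_loss((0, 0, 1, 1), (2, 2, 3, 3)))
```

Unlike a plain IoU loss, the distance term stays non-zero when the boxes do not overlap, so the loss still distinguishes near misses from far misses; this is one plausible reading of the abstract's claim that CIoU speeds up box regression.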

CLC Number: TP391

QIN Peng, TANG Chuanming, LIU Yunfeng, ZHANG Jianlin, XU Zhiyong. Infrared Target Detection Method Based on Improved YOLOv3[J]. Computer Engineering, 2022, 48(3): 211-219.

http://www.ecice06.com/CN/Y2022/V48/I3/211

Figures/Tables: 14



