Road Small Target Detection Network Based on Feedback Mechanism and Dilated Convolution

doi:10.19678/j.issn.1000-3428.0063575

Computer Engineering, 2023, Vol. 49, Issue (1): 287-294.

DOU Yunchong^1,3, HOU Jin^1,3, ZENG Leiming^2,3, CHEN Zirui^1,3

1. Laboratory of Intelligent Perception and Smart Operation & Maintenance, School of Information Science and Technology, Southwest Jiaotong University, Chengdu 611756, China;
2. School of Computing and Artificial Intelligence, Southwest Jiaotong University, Chengdu 611756, China;
3. National Engineering Laboratory of Comprehensive Transportation Big Data Application Technology, Southwest Jiaotong University, Chengdu 611756, China

Received: 2021-12-20 Revised: 2022-02-24 Published: 2022-03-21
About the authors: DOU Yunchong (born 1996), male, master's student, whose main research interests are object detection and signal and information processing; HOU Jin (corresponding author), associate professor, Ph.D.; ZENG Leiming and CHEN Zirui, master's students.
Funding:
Science and Technology Program of Sichuan Province (2020SYSY0016).

Abstract

Abstract: With the development of Convolutional Neural Networks (CNNs) and feature pyramids, target detection has made breakthroughs on large and medium targets, but small targets still suffer from missed detections and low detection accuracy. To address the limited information that small targets carry in an image and the size gap between small and large targets, this study proposes the YOLOv4-RF algorithm, an improvement on YOLOv4 that further enhances detection performance on small targets. Dilated convolution replaces the pooling pyramid in the neck of YOLOv4, which reduces semantic loss and yields a larger receptive field in the deeper part of the network. In addition, the backbone network is made lightweight and a feedback mechanism from the feature pyramid to the backbone network is added, so that features fused from shallow and deep layers are processed again; this retains more feature information about small targets and improves the effectiveness of classification and localization. Finally, because small targets are hard-to-detect samples, the Focal Loss function is introduced to increase the loss weight of hard samples, which completes the YOLOv4-RF algorithm. Experiments on the KITTI dataset show that YOLOv4-RF achieves higher detection accuracy than YOLOv4 in every category and improves the Mean Average Precision (mAP@0.5) by 1.4% while reducing the model size by 138 MB.
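The abstract names Focal Loss as the means of up-weighting hard samples but does not give the exact form or hyperparameters used in YOLOv4-RF. The following is a minimal sketch, assuming the standard binary focal loss FL(p_t) = -α_t (1 - p_t)^γ log(p_t) with the commonly used defaults α = 0.25 and γ = 2. Those values, the function names, and the example probabilities are illustrative assumptions, not taken from the paper.

```python
import math


def binary_cross_entropy(p, y, eps=1e-12):
    """Plain cross-entropy for one prediction.

    p: predicted probability of the positive class, in (0, 1)
    y: ground-truth label, 0 or 1
    """
    p_t = p if y == 1 else 1.0 - p          # probability given to the true class
    p_t = min(max(p_t, eps), 1.0)           # clamp to avoid log(0)
    return -math.log(p_t)


def binary_focal_loss(p, y, alpha=0.25, gamma=2.0, eps=1e-12):
    """Binary focal loss for one prediction (alpha and gamma are assumed defaults)."""
    p_t = p if y == 1 else 1.0 - p
    alpha_t = alpha if y == 1 else 1.0 - alpha
    p_t = min(max(p_t, eps), 1.0)
    # (1 - p_t) ** gamma shrinks the loss of well-classified (easy) samples,
    # so hard samples carry a larger share of the total loss.
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)


if __name__ == "__main__":
    # Hypothetical positive samples: one easy (p = 0.95), one hard (p = 0.10).
    for name, p in (("easy", 0.95), ("hard", 0.10)):
        ce = binary_cross_entropy(p, 1)
        fl = binary_focal_loss(p, 1)
        print(f"{name}: cross-entropy={ce:.5f}  focal={fl:.5f}")
```

Comparing the two printed rows, the hard-to-easy loss ratio is far larger under focal loss than under cross-entropy, which is the re-weighting effect the abstract refers to.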

Keywords: small target detection, YOLOv4 algorithm, dilated convolution, feedback mechanism, recursive feature pyramid
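Dilated convolution, listed among the keywords, is used in the abstract to obtain a larger receptive field without pooling. As a rough illustration, a k×k kernel with dilation d covers an effective extent of k + (k - 1)(d - 1) pixels per side while keeping the same number of weights. The sketch below computes the receptive field of a stack of such layers; the layer configurations and dilation rates are hypothetical and are not the ones used in YOLOv4-RF.

```python
def effective_kernel_size(kernel_size, dilation):
    """Side length covered by a dilated kernel: k + (k - 1) * (d - 1)."""
    return kernel_size + (kernel_size - 1) * (dilation - 1)


def stacked_receptive_field(layers):
    """Receptive field (in input pixels, per side) of a stack of conv layers.

    layers: iterable of (kernel_size, dilation, stride) tuples, input to output.
    """
    receptive_field = 1
    jump = 1  # distance in input pixels between neighbouring outputs of the current layer
    for kernel_size, dilation, stride in layers:
        receptive_field += (effective_kernel_size(kernel_size, dilation) - 1) * jump
        jump *= stride
    return receptive_field


if __name__ == "__main__":
    # Three 3x3 layers with stride 1: ordinary versus dilated (rates 1, 2, 4 are assumed).
    plain = [(3, 1, 1), (3, 1, 1), (3, 1, 1)]
    dilated = [(3, 1, 1), (3, 2, 1), (3, 4, 1)]
    print("plain stack receptive field:  ", stacked_receptive_field(plain))
    print("dilated stack receptive field:", stacked_receptive_field(dilated))
```

Both stacks have the same number of weights and no downsampling, so the dilated stack widens the receptive field while preserving feature-map resolution.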

CLC number: TP18

DOU Yunchong, HOU Jin, ZENG Leiming, CHEN Zirui. Road Small Target Detection Network Based on Feedback Mechanism and Dilated Convolution[J]. Computer Engineering, 2023, 49(1): 287-294.

https://www.ecice06.com/CN/Y2023/V49/I1/287
