
Computer Engineering ›› 2023, Vol. 49 ›› Issue (12): 169-177. doi: 10.19678/j.issn.1000-3428.0066677

• Graphics and Image Processing •

  • About the authors:

    HU Qingxiang (born 1997), male, master's degree candidate; his main research interest is object detection

    RAO Wenbi, professor, Ph.D.

    XIONG Shengwu, professor, Ph.D.

  • Funding:
    National Natural Science Foundation of China (62176194); Science and Technology Innovation Program of Hubei Province (2020AAA001); Sanya Science and Education Innovation Park of Wuhan University of Technology (2021KF0031)

Lightweight Small Object Detection Algorithm for UAV Remote Sensing Scene

Qingxiang HU1, Wenbi RAO1,2, Shengwu XIONG1,2   

  1. School of Computer Science and Artificial Intelligence, Wuhan University of Technology, Wuhan 430000, China
    2. Sanya Science and Education Innovation Park of Wuhan University of Technology, Sanya 572000, Hainan, China
  • Received:2023-01-04 Online:2023-12-15 Published:2023-12-14


Abstract:

Among deep learning based object detection algorithms, the YOLO family stands out for combining speed with accuracy. However, its application to Unmanned Aerial Vehicle (UAV) remote sensing faces challenges such as slow detection speed, high computational demands, and poor accuracy on small objects. To overcome these limitations, this paper introduces SS-YOLO, a lightweight YOLO-based algorithm optimized for small object detection. SS-YOLO uses a lightweight backbone network to raise inference speed and, following the divide-and-conquer idea of the Feature Pyramid Network (FPN), adds a high-resolution feature map P2 with a downsampling factor of four dedicated to detecting tiny objects. To address the lack of semantic information in the high-resolution feature maps (P2, P3), a semantic enhancement upsampling module combined with adaptive fusion factors is constructed. Moreover, because the Intersection over Union (IoU) metric used in the localization loss is sensitive to object size, which degrades localization accuracy for small objects, SS-YOLO adopts a new LCNWD localization regression loss function that merges the Normalized Wasserstein Distance (NWD) metric with a center point distance penalty term. Experimental results show that SS-YOLO reduces the parameter count by 31.3% and 20.6% relative to YOLOv5s and the recent YOLOv7-tiny, respectively. On the VisDrone and AI-TOD datasets, its mean Average Precision (mAP) exceeds that of YOLOv7-tiny by 7.5 and 7.0 percentage points, and that of YOLOv5s by 2.3 and 3.6 percentage points, respectively. Notably, with an input image size of 800×800 pixels, SS-YOLO runs at 110 Frames Per Second (FPS), significantly improving small object detection while meeting the real-time requirements of edge devices such as UAVs.
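The size sensitivity of IoU that motivates the NWD-based loss can be illustrated numerically. The sketch below is illustrative only: the paper's exact LCNWD formulation is not given in this abstract, and the constant `c = 12.8` is an assumed value (the one reported for AI-TOD in the original NWD work, not taken from this paper). The same 4-pixel shift that barely changes the IoU of a 128×128 box cuts the IoU of a 16×16 box to 0.6.

```python
import math

def iou(a, b):
    """IoU of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    ix = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    iy = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = ix * iy
    union = ((a[2] - a[0]) * (a[3] - a[1])
             + (b[2] - b[0]) * (b[3] - b[1]) - inter)
    return inter / union

def nwd(p, q, c=12.8):
    """Normalized Wasserstein Distance between boxes (cx, cy, w, h),
    each modeled as a 2-D Gaussian; c is a dataset-dependent constant."""
    # Squared 2-Wasserstein distance between the two Gaussians.
    w2_sq = ((p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2
             + ((p[2] - q[2]) / 2) ** 2 + ((p[3] - q[3]) / 2) ** 2)
    return math.exp(-math.sqrt(w2_sq) / c)

# A 4-pixel horizontal shift: IoU collapses for the tiny box but barely
# moves for the large one, while NWD responds identically at both scales.
print(iou((0, 0, 16, 16), (4, 0, 20, 16)))      # 0.6
print(iou((0, 0, 128, 128), (4, 0, 132, 128)))  # ~0.939
print(nwd((8, 8, 16, 16), (12, 8, 16, 16)))     # ~0.732
```

Because NWD depends on the gap between box centers and sizes rather than on overlap area, it still yields a meaningful similarity (and hence a usable gradient) even when a small predicted box and its ground truth do not overlap at all, which is plausibly why the paper pairs it with an additional center point distance penalty in LCNWD.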

Key words: small object detection, YOLO network, lightweight network, bidirectional feature pyramid, localization loss function