Air-to-Ground Dense Small Target Detection Algorithm Based on Multi-scale Feature Fusion

doi:10.19678/j.issn.1000-3428.0252144

Abstract

Low-altitude aerial photography by unmanned aerial vehicles (UAVs) is widely used in fields such as data acquisition, security monitoring, and terrain mapping, where it has significantly improved operational efficiency. However, because aerial images usually cover a wide area, the targets to be detected are often small, densely packed, and unevenly distributed, which limits detection accuracy. To address these problems, this paper proposes TOD-YOLO, a target detection algorithm based on an improved YOLOv8 network. First, an RFAConv downsampling module is introduced into the backbone network; it gives the convolution kernels effective attention at the cost of only a small increase in the number of parameters, so that the network focuses on smaller targets during feature extraction. Second, a feature fusion module, CSPOKM, and a new feature fusion path are proposed to strengthen the extraction of small-target feature information and improve the performance of the feature fusion network. Third, a lightweight attention detection head, DWA-Head, is designed on the basis of depthwise separable convolution (DWConv) and the EMA attention mechanism; it replaces the original detection head, reducing the number of parameters while improving detection accuracy. Finally, because CIoU is prone to false detections and missed detections in complex scenes with a high density of small targets, a loss function, DA-MPDIoU, is designed that dynamically adjusts the positive and negative sample coefficients and assigns higher weights to small-target samples that are easily missed or hard to classify. Experimental results show that, compared with the original YOLOv8 algorithm, the improved algorithm raises mAP@0.5 and mAP@0.5:0.95 on the VisDrone2019 dataset by 9.6 and 6.8 percentage points, respectively. Further generalization experiments on the DOTA dataset confirm that the algorithm has clear advantages and potential in small target detection tasks.
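
The parameter saving that motivates building a detection head on depthwise separable convolution can be illustrated with a short count. This is a minimal sketch of the standard arithmetic only, not the authors' DWA-Head; the function names and layer sizes are hypothetical, and bias terms are ignored.

```python
def standard_conv_params(k: int, c_in: int, c_out: int) -> int:
    """Weights in a standard k x k convolution (bias ignored)."""
    return k * k * c_in * c_out


def depthwise_separable_params(k: int, c_in: int, c_out: int) -> int:
    """Weights in a depthwise k x k convolution plus a 1 x 1 pointwise convolution."""
    depthwise = k * k * c_in   # one k x k filter per input channel
    pointwise = c_in * c_out   # 1 x 1 convolution that mixes channels
    return depthwise + pointwise


# Hypothetical layer sizes, chosen only for illustration.
k, c_in, c_out = 3, 64, 128
std = standard_conv_params(k, c_in, c_out)
dws = depthwise_separable_params(k, c_in, c_out)
print(f"standard: {std}, depthwise separable: {dws}")
```

For these assumed sizes the depthwise separable layer needs roughly one eighth of the weights of the standard layer, which is consistent with the abstract's claim that the head can cut parameters.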
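
DA-MPDIoU builds on MPDIoU, which subtracts from the IoU the squared distances between the matching top-left and bottom-right corners of the predicted and ground-truth boxes, normalized by the input image size. The sketch below shows only that baseline MPDIoU term as it is commonly formulated. The dynamic adjustment of positive and negative sample coefficients is not specified in the abstract and is not reproduced here. The function names, the `(x1, y1, x2, y2)` box format, and the example values are assumptions.

```python
def mpdiou(pred, target, img_w, img_h):
    """Baseline MPDIoU for two boxes given as (x1, y1, x2, y2)."""
    px1, py1, px2, py2 = pred
    tx1, ty1, tx2, ty2 = target

    # Plain IoU of the two boxes.
    inter_w = max(0.0, min(px2, tx2) - max(px1, tx1))
    inter_h = max(0.0, min(py2, ty2) - max(py1, ty1))
    inter = inter_w * inter_h
    union = (px2 - px1) * (py2 - py1) + (tx2 - tx1) * (ty2 - ty1) - inter
    iou = inter / union if union > 0 else 0.0

    # Squared distances between matching corners.
    d1_sq = (px1 - tx1) ** 2 + (py1 - ty1) ** 2  # top-left corners
    d2_sq = (px2 - tx2) ** 2 + (py2 - ty2) ** 2  # bottom-right corners

    # Normalize by the squared diagonal of the input image.
    norm = img_w ** 2 + img_h ** 2
    return iou - d1_sq / norm - d2_sq / norm


def mpdiou_loss(pred, target, img_w, img_h):
    """Loss form: zero for a perfect match, larger as the boxes diverge."""
    return 1.0 - mpdiou(pred, target, img_w, img_h)


# A hypothetical 10 x 10 target in a 640 x 640 image, with the prediction shifted by 2 pixels.
target = (10, 10, 20, 20)
shifted = (12, 10, 22, 20)
print(mpdiou_loss(target, target, 640, 640))
print(mpdiou_loss(shifted, target, 640, 640))
```

In this example a 2-pixel shift already lowers the IoU of a 10 x 10 box to about 0.67, which shows why small targets are sensitive to localization error. Unlike plain IoU, the corner-distance terms also keep the loss informative when the boxes do not overlap at all.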

Deng Yuhui, Deng Yueming, He Xin. Air-to-Ground Dense Small Target Detection Algorithm Based on Multi-scale Feature Fusion[J]. Computer Engineering, doi: 10.19678/j.issn.1000-3428.0252144.

邓宇辉, 邓月明, 何鑫. 基于多尺度特征融合的空对地密集小目标检测算法[J]. 计算机工程, doi: 10.19678/j.issn.1000-3428.0252144.


URL: https://www.ecice06.com/EN/10.19678/j.issn.1000-3428.0252144

