Improved YOLOv12n for Small Object Detection in Aerial Images

doi:10.19678/j.issn.1000-3428.0260088

Abstract

Small objects in UAV aerial images occupy few pixels, vary sharply in scale, and are densely distributed. To address these problems, an improved algorithm based on YOLOv12n, named SAM-YOLOv12n, is proposed. In the backbone network, a Dual-Attention Coupled C2f for Small Object (DA-C2f-S) module is designed; by introducing a multilevel feature extraction structure and a dual attention mechanism, it strengthens the capture of fine features of small objects such as edges and textures. A Multi-Scale Fusion Convolution (MSFConv) module is constructed with Dilated Depthwise Separable Convolution (DDSConv) at its core, using differentiated branches with different dilation rates. This jointly models local details and global context, compensates for the limitations of a single-scale receptive field, and adapts better to the scale fluctuation of small aerial objects. The detection head is restructured: the high-resolution branch is retained and the large-object detection head is removed, so that computation is concentrated on dense small-object regions. Experimental results on the VisDrone2019 dataset show that the improved method raises mAP@0.5 and mAP@0.5:0.95 by 9.9% and 7.2%, respectively, over the baseline YOLOv12n, validating its effectiveness for small object detection in complex aerial scenes. Generalization experiments on the TinyPerson ultra-small object dataset and the HIT-UAV infrared aerial dataset verify the cross-domain adaptability of the method across different aerial scenes. Its core advantage lies in balancing detection accuracy, model complexity, and inference efficiency, which provides reliable technical support for real-time object detection in UAV aerial imaging.
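The abstract's point that branches with different dilation rates overcome a single-scale receptive field can be illustrated with a short calculation. The sketch below is not the authors' MSFConv implementation. It uses two standard facts: a k x k kernel with dilation d spans k + (k - 1)(d - 1) pixels, and a depthwise separable convolution needs C_in·k² + C_in·C_out weights instead of C_in·C_out·k². The channel width (64), kernel size (3), and dilation rates (1, 2, 3) are assumptions chosen for illustration; the abstract does not state the actual values.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BranchStats:
    dilation: int
    effective_kernel: int  # spatial extent covered by the dilated kernel
    params: int            # weight count of the branch (bias omitted)


def effective_kernel(kernel: int, dilation: int) -> int:
    """Extent of a dilated kernel: k + (k - 1) * (d - 1)."""
    return kernel + (kernel - 1) * (dilation - 1)


def standard_conv_params(c_in: int, c_out: int, kernel: int) -> int:
    """Weights of a standard k x k convolution (bias omitted)."""
    return c_in * c_out * kernel * kernel


def dds_conv_params(c_in: int, c_out: int, kernel: int) -> int:
    """Weights of a depthwise k x k convolution followed by a 1 x 1
    pointwise convolution (bias omitted). Dilation spreads the kernel
    taps apart but does not change the number of weights."""
    depthwise = c_in * kernel * kernel
    pointwise = c_in * c_out
    return depthwise + pointwise


def branch_stats(c_in: int, c_out: int, kernel: int, dilations) -> list:
    """One entry per parallel branch of a hypothetical multi-dilation block."""
    return [
        BranchStats(d, effective_kernel(kernel, d), dds_conv_params(c_in, c_out, kernel))
        for d in dilations
    ]


if __name__ == "__main__":
    # Hypothetical settings, not taken from the paper.
    channels, kernel, dilations = 64, 3, (1, 2, 3)
    for s in branch_stats(channels, channels, kernel, dilations):
        print(f"dilation={s.dilation}: covers {s.effective_kernel}x{s.effective_kernel}, "
              f"{s.params} weights")
    print("standard 3x3 conv:", standard_conv_params(channels, channels, kernel), "weights")
```

Under these assumed settings the three branches cover 3 x 3, 5 x 5, and 7 x 7 neighbourhoods at the same weight cost per branch, which is the sense in which differently dilated branches can combine local detail with wider context at low cost.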

CHEN Wenjie, LIANG Yin, DU Mingjing, HUANG Yaosheng, LIU Yanjie. Improved YOLOv12n for Small Object Detection in Aerial Images[J]. Computer Engineering, doi: 10.19678/j.issn.1000-3428.0260088.

