VD-YOLOv11: Target Detection Algorithm for UAV Images Based on Improved YOLOv11

doi:10.19678/j.issn.1000-3428.0252920

Abstract

Small targets in UAV imagery occupy few pixels, vary widely in scale, and are easily obscured by background interference. Existing detection methods are limited in feature representation and fusion, and therefore struggle with complex backgrounds and small-scale objects. To address these issues, this paper proposes VD-YOLOv11, an improved algorithm tailored to drone-captured scenes. First, a Multi-Scale Feature Enhancement (MSFE) module strengthens the model's perception of tiny objects by incorporating multi-scale contextual information and an edge-detail reinforcement mechanism. Second, a Multi-Scale Feature Fusion (MSFF) module integrates semantic and spatial information from different feature levels, improving small-object representation and detection accuracy under complex backgrounds and scale variation. Third, a Receptive-Field Attention Head (RFAHead) enables dynamic interaction among multi-level features and adaptive allocation of receptive-field weights, using attention guidance to focus the network on fine-grained small-object regions. Finally, a dedicated small-object detection layer is fused with the improved neck network, and an additional detection head is added to reduce the loss of small-object features and strengthen recognition. Experiments show that VD-YOLOv11 achieves 42.1% mAP50 on the VisDrone2019 dataset, 7.4% higher than the baseline YOLOv11n, and 94.8% mAP50 on the PDT dataset, with a computational cost of 19.1 GFLOPs and 3.3M parameters. These results indicate that VD-YOLOv11 balances detection accuracy, computational complexity, and model size, supporting its effectiveness and practicality for UAV-based small object detection.
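The mAP50 figures above are mean average precision with a detection counted as correct when its IoU with a ground-truth box is at least 0.5. As a rough illustration of that metric only (not the authors' evaluation code), the following sketch computes AP at IoU 0.5 for one class. The function names, the greedy matching rule, and the all-point interpolation are assumptions; evaluation toolkits differ in these details, and in practice predictions are pooled over the whole dataset and the result is averaged over classes.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2) in pixels


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def ap50(preds: List[Tuple[float, Box]], gts: List[Box], thr: float = 0.5) -> float:
    """Hypothetical helper: AP for one class; preds are (confidence, box) pairs."""
    if not gts:
        return 0.0
    matched = [False] * len(gts)
    hits = []  # True for a true positive, False for a false positive
    for _, box in sorted(preds, key=lambda p: p[0], reverse=True):
        # Greedily match each prediction to the best still-unmatched ground truth.
        best, best_iou = -1, thr
        for i, gt in enumerate(gts):
            overlap = iou(box, gt)
            if not matched[i] and overlap >= best_iou:
                best, best_iou = i, overlap
        if best >= 0:
            matched[best] = True
        hits.append(best >= 0)

    # Precision and recall after each prediction, in confidence order.
    tp = 0
    recalls, precisions = [], []
    for rank, hit in enumerate(hits, start=1):
        tp += hit
        recalls.append(tp / len(gts))
        precisions.append(tp / rank)

    # Precision envelope: make precision non-increasing as recall grows.
    for i in range(len(precisions) - 2, -1, -1):
        precisions[i] = max(precisions[i], precisions[i + 1])

    # Area under the precision-recall curve (all-point interpolation).
    area, prev_recall = 0.0, 0.0
    for r, p in zip(recalls, precisions):
        area += (r - prev_recall) * p
        prev_recall = r
    return area
```

For small objects this threshold is demanding: a shift of a single pixel on a 10 x 10 box already lowers the IoU to about 0.68, which is one reason small-target benchmarks such as VisDrone2019 report comparatively low mAP50 values.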


Wang Hongyu, Cui Mingzhu, Cheng Li, Luo Weili, Dang Zheng, Shi Hanqi, Ye Hongyuan, Zhao Jintao. VD-YOLOv11: Target Detection Algorithm for UAV Images Based on Improved YOLOv11[J]. Computer Engineering, doi: 10.19678/j.issn.1000-3428.0252920.



URL: https://www.ecice06.com/EN/10.19678/j.issn.1000-3428.0252920

