Computer Engineering

   

VD-YOLOv11: Target Detection Algorithm for UAV Images Based on Improved YOLOv11

  

  • Published: 2026-03-03

Abstract: Small targets in UAV imagery occupy few pixels, vary widely in size, and are easily confused with background clutter. Existing detection methods are limited in feature representation and fusion, and therefore struggle with complex backgrounds and small-scale objects. To address these issues, VD-YOLOv11, an improved algorithm tailored to drone-captured scenes, is proposed. First, a Multi-Scale Feature Enhancement (MSFE) module strengthens the model's perception of tiny objects by incorporating multi-scale contextual information and an edge-detail reinforcement mechanism. Second, a Multi-Scale Feature Fusion (MSFF) module integrates semantic and spatial information from different levels to improve small-object representation and detection accuracy under complex backgrounds and scale variation. Third, a Receptive-Field Attention Head (RFAHead) enables dynamic interaction among multi-level features and adaptive allocation of receptive-field weights, using attention guidance to focus the network on fine-grained small-object regions. Finally, a dedicated small-object detection layer is fused with the improved neck network, and an additional detection head is introduced to reduce the loss of small-object features and strengthen recognition. Experiments show that VD-YOLOv11 achieves 42.1% mAP50 on the VisDrone2019 dataset, 7.4% higher than the baseline YOLOv11n, and 94.8% mAP50 on the PDT dataset, with a computational cost of 19.1 GFLOPs and 3.3M parameters. These results indicate that VD-YOLOv11 balances detection accuracy, computational complexity, and model size, supporting its effectiveness and practicality for UAV-based small object detection.
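
The mAP50 figures quoted in the abstract count a prediction as correct when its intersection over union (IoU) with a ground-truth box reaches 0.5. The sketch below is not taken from the paper; it is a minimal illustration of that matching criterion, and the function names `iou` and `is_match_at_50` are hypothetical. It also shows why small targets are hard to localize: a shift of a few pixels that barely affects a large box can push a small one below the threshold. The exact comparison at the 0.5 boundary varies between evaluation toolkits, so `>=` here is an assumption.

```python
def iou(box_a, box_b):
    """Intersection over union of two axis-aligned (x1, y1, x2, y2) boxes."""
    # Corners of the overlapping rectangle (may be empty).
    ix1 = max(box_a[0], box_b[0])
    iy1 = max(box_a[1], box_b[1])
    ix2 = min(box_a[2], box_b[2])
    iy2 = min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)

    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def is_match_at_50(pred_box, gt_box):
    """Hypothetical helper: True if the prediction passes the IoU 0.5 criterion."""
    # Assumption: boundary handled with >=; some toolkits use a strict >.
    return iou(pred_box, gt_box) >= 0.5


if __name__ == "__main__":
    small_gt = (0, 0, 10, 10)      # 10 x 10 pixel target
    large_gt = (0, 0, 100, 100)    # 100 x 100 pixel target

    # The same 4-pixel horizontal shift applied to both predictions.
    small_pred = (4, 0, 14, 10)
    large_pred = (4, 0, 104, 100)

    print("small target matched:", is_match_at_50(small_pred, small_gt))
    print("large target matched:", is_match_at_50(large_pred, large_gt))
```

In this toy case the 4-pixel shift leaves the large box with an IoU above 0.9, while the small box drops to roughly 0.43 and no longer counts as a detection.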
