Computer Engineering ›› 2026, Vol. 52 ›› Issue (4): 214-228. doi: 10.19678/j.issn.1000-3428.0070151

• Computer Vision and Image Processing •

RSD-YOLO-Based Small Target Detection in UAV Aerial Images

TANG Weibo1, FANG Qiang2,3,4, LI Peigen2,3,4, AI Longjin1, XIONG Jinhong1, XIA Haiting1   

    1. Faculty of Civil Aviation and Aeronautics, Kunming University of Science and Technology, Kunming 650500, Yunnan, China;
    2. Faculty of Civil Engineering and Mechanics, Kunming University of Science and Technology, Kunming 650500, Yunnan, China;
    3. International Joint Laboratory for Green Construction and Intelligent Maintenance of Yunnan Province, Kunming 650500, Yunnan, China;
    4. Yunnan Key Laboratory of Disaster Reduction in Civil Engineering, Kunming 650500, Yunnan, China
  • Received: 2024-07-19 Revised: 2024-09-18 Published: 2024-12-05

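  • About the authors: TANG Weibo (CCF student member), male, master's student, main research interest: object detection; FANG Qiang and LI Peigen, doctoral students; AI Longjin and XIONG Jinhong, master's students; XIA Haiting (corresponding author), professor, Ph.D., E-mail: haiting.xia@kust.edu.cn
  • Supported by:
    National Natural Science Foundation of China (12262015).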

Abstract: The RSD-YOLO algorithm, based on YOLOv8s, is proposed to address the low detection performance, severe occlusion, difficult small-target feature extraction, and large parameter counts encountered when detecting targets in Unmanned Aerial Vehicle (UAV) aerial images. First, a Receptive Field Attention (RFA) module, CSP-RFA, is designed to replace the C2f module; it strengthens small-target feature extraction and addresses the insensitivity of conventional convolution to positional changes. Second, the backbone and feature fusion networks are made lightweight, a new detection branch operating on a large feature map is added, and a Receptive Field Pyramid Network (RFPN) is proposed to optimize the direction of feature flow and improve feature representation. Third, the detection head is optimized by integrating multi-scale features into a head with a multi-level attention mechanism, and the loss function is replaced, improving the model's detection performance on small targets. Finally, for model compression, the Layer-Adaptive Magnitude-based Pruning (LAMP) algorithm is employed to further reduce the number of parameters and the model size. Experimental results show that the lightweight RSD-YOLO model significantly outperforms the baseline model on the publicly available VisDrone2019 dataset: precision increases by 10.0 percentage points, mAP@0.5 by 9.5 percentage points (a 24.1% relative increase), and mAP@0.5:0.95 by 6.9 percentage points (a 29.4% relative increase). The number of model parameters falls from 11.12×10^6 to 4.05×10^6, a 63.6% reduction, and the computational cost falls from 42.7 GFLOPs to 25.5 GFLOPs, a reduction of about 40%. Furthermore, on a newly filtered dataset containing only occluded small targets, RSD-YOLO improves precision, mAP@0.5, and mAP@0.5:0.95 by 9.1, 16.1, and 10.7 percentage points, respectively.
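
The abstract reports both absolute gains (percentage points) and relative changes (percent). The short Python sketch below only re-derives the relative figures from the numbers quoted above. The function names are hypothetical, and the baseline mAP values it prints are implied by the quoted gains rather than stated on this page.

```python
def relative_reduction(before: float, after: float) -> float:
    """Relative reduction from `before` to `after`, in percent."""
    return (before - after) / before * 100.0


def implied_baseline(gain_pp: float, relative_gain_pct: float) -> float:
    """Baseline value (in percent) implied by an absolute gain in percentage
    points together with the matching relative gain in percent."""
    return gain_pp / (relative_gain_pct / 100.0)


# Figures quoted in the abstract.
params_cut = relative_reduction(11.12e6, 4.05e6)  # parameter count
flops_cut = relative_reduction(42.7, 25.5)        # GFLOPs

# Derived values, not reported on this page: the baseline scores that would
# make the quoted absolute and relative gains agree.
base_map50 = implied_baseline(9.5, 24.1)
base_map5095 = implied_baseline(6.9, 29.4)

print(f"parameter reduction:          {params_cut:.1f}%")
print(f"GFLOPs reduction:             {flops_cut:.1f}%")
print(f"implied baseline mAP@0.5:      {base_map50:.1f}")
print(f"implied baseline mAP@0.5:0.95: {base_map5095:.1f}")
```

The parameter reduction comes out at 63.6% and the GFLOPs reduction at 40.3%, consistent with the rounded figures in the abstract.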

Key words: Unmanned Aerial Vehicle (UAV), small target detection, YOLOv8s, attention mechanism, feature fusion, occluded target detection
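
To illustrate the LAMP compression step named in the abstract, here is a minimal NumPy sketch of the LAMP score as it is usually defined: each squared weight is divided by the sum of squared weights in the same layer whose magnitude is at least as large. This is an unstructured, toy-sized illustration and not the authors' implementation. The function names `lamp_scores` and `lamp_masks` are hypothetical, and every layer is assumed to contain at least one nonzero weight.

```python
import numpy as np


def lamp_scores(weights: np.ndarray) -> np.ndarray:
    """LAMP score of every weight in one layer.

    score(u) = w_u^2 / (sum of w_v^2 over weights v in the same layer whose
    magnitude is at least that of w_u, with w_u itself included).
    Assumes the layer has at least one nonzero weight.
    """
    flat_sq = weights.astype(float).ravel() ** 2
    order = np.argsort(flat_sq)                  # ascending magnitude
    sorted_sq = flat_sq[order]
    # Suffix sums: entry i holds sum(sorted_sq[i:]).
    suffix = np.cumsum(sorted_sq[::-1])[::-1]
    scores = np.empty_like(flat_sq)
    scores[order] = sorted_sq / suffix           # undo the sort
    return scores.reshape(weights.shape)


def lamp_masks(layers, sparsity):
    """Keep-masks after globally pruning the lowest-scoring weights.

    Roughly `sparsity` of all weights are pruned; tied scores at the
    threshold are pruned together, so the count can be slightly higher.
    """
    scores = [lamp_scores(w) for w in layers]
    all_scores = np.concatenate([s.ravel() for s in scores])
    n_prune = int(sparsity * all_scores.size)
    if n_prune == 0:
        return [np.ones(w.shape, dtype=bool) for w in layers]
    threshold = np.sort(all_scores)[n_prune - 1]
    return [s > threshold for s in scores]


# Two toy layers with very different weight scales.
layers = [np.array([[0.1, -0.2], [0.3, 0.4]]),
          np.array([10.0, -20.0, 30.0])]
masks = lamp_masks(layers, sparsity=0.5)
for mask in masks:
    print(mask)
```

Because the score is normalized within each layer, the largest weight of a layer always scores 1. In this toy example, pruning is therefore spread across both layers instead of removing the small-scale first layer entirely, which is what a single global magnitude threshold would do.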

