
Computer Engineering ›› 2025, Vol. 51 ›› Issue (6): 184-192. doi: 10.19678/j.issn.1000-3428.0068698

• Artificial Intelligence and Pattern Recognition •

Small Object Detection Algorithm for Aerial Photography Based on Improved YOLOv3

XI Qi, WANG Mingjie, WEI Jinghe*, ZHAO Wei

  1. The 58th Research Institute of China Electronics Technology Group Corporation, Wuxi 214122, Jiangsu, China
  • Received: 2023-10-25 Online: 2025-06-15 Published: 2025-06-05
  • Contact: WEI Jinghe

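  • Supported by:
    General Program of the National Natural Science Foundation of China (62174150); Young Scientists Fund of the National Natural Science Foundation of China (62204233); General Program of the Natural Science Foundation of Jiangsu Province (BK20211041); General Program of the Natural Science Foundation of Jiangsu Province (BK20211040); Jiangsu Province Key Project for Industry Foresight and Key Core Technologies (BE2021003-1); Jiangsu Province Project for Industry Foresight and Key Core Technologies (BE2023005-1)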

Abstract:

To address the low detection precision, missed detections, and false detections that occur with small objects, this study presents an improved You Only Look Once version 3 (YOLOv3) algorithm for small object detection. First, in terms of network structure, DenseNet-121, a Densely Connected Network (DenseNet), replaces the original Darknet-53 as the backbone network to improve its feature extraction capability, and the convolution kernel size is modified to further reduce the loss of feature map information. To make the detection model more robust for small objects, a fourth feature detection layer with a size of 104×104 pixels is added. Second, for feature map fusion, bilinear interpolation replaces the original nearest-neighbor interpolation in the upsampling operations, to address the serious feature loss found in most detection algorithms. Finally, in terms of the loss function, Generalized Intersection over Union (GIoU) is used instead of Intersection over Union (IoU) to calculate the bounding box loss, and the Focal Loss function is introduced as the confidence loss of the bounding box. Experimental results show that the improved algorithm achieves a mean Average Precision (mAP) of 63.3% on the VisDrone2019 dataset, which is 13.2 percentage points higher than that of the original YOLOv3 detection model, and reaches a detection speed of 52 frames/s on a GTX 1080 Ti device. The improved algorithm has good detection performance for small objects.
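
The two loss terms named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it uses the commonly published definitions of GIoU and binary focal loss, assumes boxes in `(x1, y1, x2, y2)` corner format, and the function names and the defaults `alpha=0.25` and `gamma=2.0` are assumptions that do not come from this abstract.

```python
import math


def giou(box_a, box_b):
    """GIoU of two boxes in (x1, y1, x2, y2) format (assumed layout)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b

    # Intersection area; zero when the boxes do not overlap.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h

    area_a = (ax2 - ax1) * (ay2 - ay1)
    area_b = (bx2 - bx1) * (by2 - by1)
    union = area_a + area_b - inter
    iou = inter / union

    # Area of the smallest axis-aligned box enclosing both inputs.
    hull = (max(ax2, bx2) - min(ax1, bx1)) * (max(ay2, by2) - min(ay1, by1))

    # GIoU subtracts the share of the enclosing box not covered by the union.
    return iou - (hull - union) / hull


def giou_loss(box_a, box_b):
    """Bounding box loss: 0 for identical boxes, up to 2 for distant ones."""
    return 1.0 - giou(box_a, box_b)


def focal_loss(p, target, alpha=0.25, gamma=2.0):
    """Binary focal loss for one confidence p; alpha and gamma are assumed."""
    p = min(max(p, 1e-7), 1.0 - 1e-7)  # clamp to avoid log(0)
    p_t = p if target == 1 else 1.0 - p
    alpha_t = alpha if target == 1 else 1.0 - alpha
    # (1 - p_t) ** gamma shrinks the loss of well-classified examples.
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)
```

Unlike plain IoU, which is zero for every pair of non-overlapping boxes, GIoU keeps decreasing as the boxes move apart, so the loss still distinguishes near misses from far misses. That property is a plausible reason for preferring it when small objects are involved, although the abstract does not state the motivation.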

Key words: small object detection, You Only Look Once version 3 (YOLOv3), Densely Connected Network (DenseNet), loss function, Generalized Intersection over Union (GIoU)
