
Computer Engineering ›› 2025, Vol. 51 ›› Issue (12): 294-303. doi: 10.19678/j.issn.1000-3428.0070027

• Graphics and Image Processing •

Optimization of Small Object Detection Model Based on YOLOv8

WANG Guoming, JIA Daiwang*

  1. School of Computer Science and Engineering, Anhui University of Science and Technology, Huainan 232001, Anhui, China
  • Received: 2024-06-21 Revised: 2024-08-11 Published: 2025-12-15 Released online: 2024-10-21
  • Contact: JIA Daiwang
  • Funding:
    Young Scientists Fund of the National Natural Science Foundation of China (62102003); Anhui Provincial College Students' Innovation and Entrepreneurship Fund (S202310361157); Anhui Provincial College Students' Innovation and Entrepreneurship Fund (S202310361161)

Abstract:

Deep learning has substantially improved the detection of medium and large objects. Small objects remain difficult: because of their small scale and the complex backgrounds they appear against, conventional detectors frequently miss them or produce false positives. To improve small object detection accuracy, this study modifies the YOLOv8 model in four ways. First, the convolutional modules in the backbone are replaced with RFAConv modules, which strengthens the model's ability to process complex images. Second, a Mixed Local Channel Attention (MLCA) mechanism is introduced in the neck, helping the model fuse features from different levels more effectively while maintaining computational efficiency. Third, the Detect head of YOLOv8 is replaced with the Detect_FASFF head to address inconsistency across feature scales and to improve small object detection. Finally, the Complete Intersection over Union (CIoU) loss function is replaced with the Focaler-IoU loss function, so that the model focuses more on small objects that are difficult to localize precisely. Experimental results show that on the FloW-Img dataset, where small objects are sparse, the improved model raises mAP@0.5 by 4.8 percentage points and mAP@0.5:0.95 by 3.0 percentage points; on the VisDrone2019 dataset, where small objects are dense, mAP@0.5 rises by 5.9 percentage points and mAP@0.5:0.95 by 4.0 percentage points. Generalization comparison experiments are also conducted on the low-altitude dataset AU-AIR and the dense pedestrian detection dataset WiderPerson. The results indicate that the optimized model achieves markedly higher small object detection accuracy than the original model and applies to a wider range of scenarios.

Key words: deep learning, YOLOv8 network model, small object detection, attention mechanism, loss
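
The loss replacement mentioned in the abstract can be illustrated with a minimal sketch. It assumes the commonly described Focaler-IoU form, in which the plain IoU is linearly rescaled from an interval [d, u] to [0, 1] and clamped outside it, and the loss is one minus that value. The function names and the thresholds d = 0.0 and u = 0.95 are illustrative assumptions; the abstract does not state which variant or settings the authors used.

```python
def box_iou(a, b):
    """IoU of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def focaler_iou(iou, d=0.0, u=0.95):
    """Rescale IoU linearly from [d, u] to [0, 1], clamping outside.

    d and u are hypothetical threshold values chosen for illustration.
    """
    if not d < u:
        raise ValueError("d must be smaller than u")
    return min(1.0, max(0.0, (iou - d) / (u - d)))


def focaler_iou_loss(pred, target, d=0.0, u=0.95):
    """Loss = 1 - rescaled IoU (assumed form, not the authors' code)."""
    return 1.0 - focaler_iou(box_iou(pred, target), d, u)


if __name__ == "__main__":
    pred = (0.0, 0.0, 2.0, 2.0)
    target = (1.0, 1.0, 3.0, 3.0)
    print("IoU:", box_iou(pred, target))
    print("Focaler-IoU loss:", focaler_iou_loss(pred, target))
```

Moving d and u changes which IoU range contributes gradient: samples whose IoU falls below d or above u sit on a flat part of the mapping, so the interval selects the samples the regression loss attends to.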