
Computer Engineering

   

A Multi-Scale Object Detection Algorithm Oriented to Autonomous Driving

  

  • Published: 2025-10-30

Chinese title: 面向自动驾驶的多尺度目标检测算法 (Multi-Scale Object Detection Algorithm for Autonomous Driving)

Abstract: Object detection for autonomous driving perception aims to locate and identify traffic participants such as motor vehicles, non-motor vehicles, and pedestrians within the onboard camera's field of view in real time, providing accurate input to the environmental perception module in support of decision-making and control. Complex road backgrounds, diverse object shapes, and large scale variations lead to high false-detection and missed-detection rates in the perception system. The specific problems are low accuracy on deformed objects, insufficient multi-scale detection, and weak global perception. To address them, an improved algorithm based on YOLOv8n, named YOLOv8-DDL, is proposed. First, deformable attention (DAttention) is introduced to improve the C2f module in the backbone network. By dynamically learning feature offsets, it strengthens the capture of the varied object shapes found in traffic scenes, improves the model's adaptability to complex spatial distributions, and reduces false detections. Second, large separable kernel attention (LSKA) is integrated into the Spatial Pyramid Pooling-Fast (SPPF) module, expanding the receptive field through large-kernel convolution to strengthen global context modeling and robustness against complex backgrounds. Finally, a dynamic multi-scale adaptive fusion module (DMAF) and a dynamic feature pyramid network (Dynamic-FPN) are designed to reconstruct the neck network, dynamically fusing high-level and low-level features to enhance multi-scale feature representation and improve multi-scale object detection. Experimental results on the public SODA10M dataset show that, compared with YOLOv8n, YOLOv8-DDL improves precision (P), recall (R), F1-score, and mean average precision (mAP@0.5) by 5.9%, 1.3%, 3%, and 1.5%, respectively. Additional validation on the public BDD100K dataset shows improvements of 2%, 0.6%, 1%, and 2% in the same metrics, respectively.
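As a rough illustration of the "separable large kernel" idea behind LSKA, the sketch below checks that a k x k kernel that factors into a column vector and a row vector can be applied as two cheap 1D passes with the same result. This is not the paper's LSKA module, which would also involve learned weights, channel-wise processing, and an attention multiplication. The kernel values, image, and helper names (`conv2d_valid`, `separable_conv2d`, `outer`) are assumptions chosen only for this example.

```python
def conv2d_valid(image, kernel):
    """Plain 2D cross-correlation with no padding ('valid' mode)."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[i + a][j + b] * kernel[a][b]
                for a in range(kh) for b in range(kw))
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]


def separable_conv2d(image, col, row):
    """Apply a 1 x k row kernel, then a k x 1 column kernel."""
    horizontal = conv2d_valid(image, [row])               # 1 x k pass
    return conv2d_valid(horizontal, [[c] for c in col])   # k x 1 pass


def outer(col, row):
    """Build the full k x k kernel as the outer product col * row."""
    return [[c * r for r in row] for c in col]


# Hypothetical 7-tap kernels and a small integer test image (illustration only).
k = 7
row = [1, 2, 3, 4, 3, 2, 1]
col = [1, 0, -1, 2, -1, 0, 1]
image = [[(3 * i + 5 * j) % 11 for j in range(12)] for i in range(10)]

full = conv2d_valid(image, outer(col, row))
separable = separable_conv2d(image, col, row)

same_result = full == separable   # integer arithmetic, so equality is exact
params_full = k * k               # weights in the full k x k kernel
params_separable = 2 * k          # weights in the two 1D kernels

print(same_result, params_full, params_separable)
```

The weight count grows linearly rather than quadratically with k, which is why separable kernels make a large receptive field affordable.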

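Abstract (translated from the Chinese): Road object detection for autonomous driving perception aims to locate and identify, in real time, traffic participants such as motor vehicles, non-motor vehicles, and pedestrians within the field of view of the onboard camera, providing accurate input to the environmental perception module and supporting the decision-making and control of the autonomous driving system. Because road scenes have complex backgrounds and objects vary widely in shape and scale, the perception system has high false-detection and missed-detection rates. To address low detection accuracy on deformed objects, insufficient multi-scale object detection, and weak global perception, an improved algorithm based on YOLOv8n, YOLOv8-DDL, is proposed. First, deformable attention (DAttention) is introduced to improve the C2f module of the backbone network; by dynamically learning feature offsets, it strengthens the capture of the many object shapes in traffic scenes, improves the model's adaptability to complex spatial distributions, and effectively reduces false detections. Second, large separable kernel attention (LSKA) is integrated to improve SPPF; large-kernel convolution expands the receptive field, strengthens the model's global context modeling, and improves robustness against complex backgrounds. Finally, a dynamic multi-scale adaptive fusion module (DMAF) and a dynamic feature pyramid network (Dynamic-FPN) are designed to reconstruct the neck network; by dynamically fusing high-level and low-level features, they strengthen the model's representational capacity in multi-scale feature fusion and improve multi-scale object detection. Experiments on the public SODA10M dataset show that, compared with YOLOv8n, YOLOv8-DDL improves precision (P), recall (R), F1-score, and mean average precision (mAP@0.5) by 5.9%, 1.3%, 3%, and 1.5%, respectively. In auxiliary validation on the public BDD100K dataset, precision (P), recall (R), F1-score, and mAP@0.5 improve by 2%, 0.6%, 1%, and 2%, respectively.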
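The F1-score reported alongside precision and recall is their harmonic mean, F1 = 2PR / (P + R). The sketch below shows how gains in P and R carry over to F1. The baseline values are made up for illustration and are not taken from the paper. Reading the reported gains as percentage points is also an assumption, since the abstract does not say whether they are absolute or relative.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall; 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical baseline (NOT from the paper), as fractions in [0, 1].
baseline_p, baseline_r = 0.70, 0.60

# Assumption: the SODA10M gains of 5.9 and 1.3 are percentage points.
improved_p = baseline_p + 0.059
improved_r = baseline_r + 0.013

baseline_f1 = f1_score(baseline_p, baseline_r)
improved_f1 = f1_score(improved_p, improved_r)
f1_gain = improved_f1 - baseline_f1

print(f"baseline F1 = {baseline_f1:.4f}")
print(f"improved F1 = {improved_f1:.4f}")
print(f"F1 gain     = {f1_gain:.4f}")
```

Because F1 is a harmonic mean, it always lies between P and R and sits closer to the smaller of the two, so a large precision gain paired with a small recall gain yields an F1 gain somewhere in between.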