
Computer Engineering

   

A Vehicle Detection Algorithm Based on Frequency-Domain Enhancement and Adaptive Multi-Scale Feature Fusion

  

  • Published: 2026-04-20

基于频域增强与多尺度特征自适应融合的车辆检测算法

Abstract: In complex road traffic scenarios, vehicle detection faces significant challenges, including large variations in object scale, frequent occlusion, and the difficulty of achieving high accuracy and real-time performance at the same time. To address these issues, an improved vehicle detection algorithm, termed YOLOv13n-FCM, is proposed with YOLOv13n as the baseline. First, Frequency Dynamic Convolution (FDConv) is introduced into the backbone to strengthen the modeling of multi-frequency information, thereby enhancing the representation of vehicle edge structures and fine-grained details. Second, a Channel–Spatial Fusion (CSF) module is designed to jointly model channel-wise and spatial features, guiding the network to focus on salient vehicle regions while suppressing background interference in complex scenes. Finally, a Multi-Branch Fusion (MBF) module is incorporated into the detection head to perform adaptive weighted fusion of multi-scale features, further improving detection of vehicles at different scales. Experimental results on two public datasets, Vehicle Detection Dataset and BITVehicle, show that YOLOv13n-FCM achieves good detection performance across different road vehicle scenarios. Specifically, on the Vehicle Detection Dataset, mAP50 reaches 60.1% and mAP50:95 reaches 42.6%, which are 2.7% and 2.6% higher, respectively, than those of the original YOLOv13n model, and 2.7% and 1.8% higher, respectively, than those of the best competing method. On BITVehicle, the proposed method also outperforms the baseline model, indicating a degree of cross-scenario adaptability. In addition, after hardware acceleration on an NVIDIA Jetson AGX Orin edge device, YOLOv13n-FCM runs at 78.5 FPS at an input resolution of 640×640. Overall, the proposed method substantially improves vehicle detection accuracy while maintaining real-time performance, demonstrating good practical value for engineering applications.
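The abstract does not describe the internals of the MBF module, so the following is only a generic sketch of what "adaptive, weighted multi-scale feature fusion" can look like, not the authors' implementation. It assumes single-channel feature maps stored as nested lists, nearest-neighbour upsampling to the largest resolution, and one scalar score per branch that is normalised with a softmax. The names `softmax`, `upsample_nearest`, `fuse_multi_scale`, and `branch_scores` are hypothetical; in a real network the scores would typically be learned or predicted from the input rather than supplied by hand.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def upsample_nearest(fmap, out_h, out_w):
    """Resize a 2D map (list of rows) to out_h x out_w by nearest-neighbour lookup."""
    in_h, in_w = len(fmap), len(fmap[0])
    return [
        [fmap[r * in_h // out_h][c * in_w // out_w] for c in range(out_w)]
        for r in range(out_h)
    ]


def fuse_multi_scale(fmaps, branch_scores):
    """Fuse maps of different sizes using softmax-normalised branch weights.

    Assumption for illustration: one scalar score per branch. The paper's
    MBF module may compute its weights differently.
    """
    out_h = max(len(f) for f in fmaps)
    out_w = max(len(f[0]) for f in fmaps)
    weights = softmax(branch_scores)
    resized = [upsample_nearest(f, out_h, out_w) for f in fmaps]
    fused = [
        [sum(w * m[r][c] for w, m in zip(weights, resized)) for c in range(out_w)]
        for r in range(out_h)
    ]
    return fused, weights


# Toy example: three branches at 1x1, 2x2, and 4x4 resolution.
coarse = [[1.0]]
middle = [[2.0, 2.0], [2.0, 2.0]]
fine = [[4.0] * 4 for _ in range(4)]

# Equal scores give equal weights, so the result is the plain average.
fused, weights = fuse_multi_scale([coarse, middle, fine], [0.0, 0.0, 0.0])
```

Raising one branch's score shifts the fused map toward that branch, which is the sense in which the weighting is adaptive: the contribution of each scale is adjusted rather than fixed.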
