
Computer Engineering

   

MS-ADFF: A multi-scale aggregation-diffusion feature fusion algorithm for pedestrian detection on waste unloading platform

  

Published: 2026-03-18


Abstract: Pedestrian detection on the unloading platforms of waste incineration power plants remains challenging because of complex lighting interference and large variations in pedestrian scale. Existing pedestrian detection methods fall short in shallow edge feature extraction, multi-scale feature fusion, and lightweight detection head design. To address these issues, this paper proposes MS-ADFF, a pedestrian detection model based on multi-scale aggregation-diffusion feature fusion. First, an edge feature enhancement module reinforces pedestrian contour information in shallow features, which mitigates the effect of blurred image detail under complex lighting. Second, a multi-scale aggregation-diffusion feature fusion network performs two rounds of feature aggregation and diffusion on the P3, P4, and P5 feature levels, integrating multi-scale semantic features and improving the model's ability to perceive pedestrians of different scales. Finally, a lightweight shared detection head built from depthwise convolution and group convolution replaces the traditional dual-branch structure with a shared feature extraction mechanism, suppressing redundant parameters while maintaining detection accuracy. With YOLOv11s as the baseline, MS-ADFF achieves a detection precision of 92.7% on the self-built WIPPID dataset, improving Recall and mAP@0.5 by 4.6% and 1.5%, respectively, while reducing computation by 0.7 GFLOPs. On the public CityPersons dataset, MS-ADFF improves detection precision by 1.9% over the baseline, again with 0.7 GFLOPs less computation. These results show that, with fewer floating-point operations than the baseline, the proposed method improves pedestrian detection accuracy on the unloading platforms of waste incineration power plants and also generalizes well, with good robustness, to street-scene pedestrian detection.
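To give a rough sense of why a shared head built from depthwise and group convolutions can cut parameters, the sketch below counts convolution weights for two hypothetical head layouts. It is an illustration only, not the MS-ADFF head: the channel width `C = 128`, the group count of 4, the two-layer depth, and the helper names `conv_params` and `branch_params` are all assumptions that do not come from the paper.

```python
def conv_params(c_in, c_out, k, groups=1):
    """Weight count of a k x k convolution with the given groups (bias ignored)."""
    if c_in % groups or c_out % groups:
        raise ValueError("channels must be divisible by groups")
    return (c_in // groups) * c_out * k * k


def branch_params(c, k=3, depthwise=False, groups=1):
    """Weights in two stacked conv layers on c channels (one hypothetical branch)."""
    if depthwise:
        # Depthwise k x k conv (groups == channels) followed by a grouped 1 x 1 conv.
        one_layer = conv_params(c, c, k, groups=c) + conv_params(c, c, 1, groups=groups)
    else:
        # Standard dense k x k conv.
        one_layer = conv_params(c, c, k)
    return 2 * one_layer


C = 128      # assumed channel width after fusion (hypothetical)
LEVELS = 3   # P3, P4, P5

# Conventional layout: separate classification and regression branches per level.
dual_branch = LEVELS * 2 * branch_params(C)

# Shared layout: one depthwise + grouped stem reused by all levels and both tasks.
shared = branch_params(C, depthwise=True, groups=4)

print(f"dual-branch weights: {dual_branch}")
print(f"shared-stem weights: {shared}")
```

Under these assumed sizes the dense dual-branch layout holds 1,769,472 weights and the shared stem 10,496. The final prediction layers are left out of both counts, so the numbers only indicate the direction of the saving, not the figures reported for MS-ADFF.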
