
Computer Engineering


Pedestrian Detection Algorithm for Scenic Spots Based on Improved YOLOv8

  • Published: 2023-12-19

Abstract: Pedestrian detection in scenic spots currently suffers from low detection accuracy and large parameter counts, and existing public datasets offer limited support for small-target detection. To address these problems, this paper builds TAPDataset, a pedestrian detection dataset intended to fill the gap left by existing datasets in small-target detection, and proposes YOLOv8-L, a new model based on YOLOv8 that combines high detection accuracy with low hardware requirements. First, the lightweight convolution module DepthSepConv is introduced to reduce the model's parameter count and computational cost. Second, the BiFormer attention mechanism and the CARAFE upsampling operator are adopted to strengthen the model's semantic understanding of images and its information fusion ability, which significantly improves detection accuracy. Finally, a small-target detection layer is added to extract more shallow features, which effectively improves detection performance on small targets. The algorithm is validated on the TAPDataset, VOC 2007, and TAP+VOC datasets. Experimental results show that, compared with YOLOv8, on TAPDataset the parameter count is reduced by 18.06% with FPS essentially unchanged, while mAP@0.5 improves by 5.51% and mAP@0.5:0.95 improves by 6.03%; on VOC 2007, the parameter count is reduced by 13.6%, mAP@0.5 improves by 3.96%, and mAP@0.5:0.95 improves by 6.39%; on TAP+VOC, the parameter count is reduced by 14.02%, mAP@0.5 improves by 4.49%, and mAP@0.5:0.95 improves by 5.68%. The improved algorithm generalizes better and is better suited to pedestrian detection in scenic spots.
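The abstract does not describe the internal structure of DepthSepConv. As a rough illustration of why depthwise separable convolutions lower the parameter count, the sketch below compares the weights of a standard convolution with those of the generic depthwise-plus-pointwise factorization. The layer sizes (128 input channels, 256 output channels, 3 x 3 kernel) and the function names are hypothetical, chosen only for this example, and bias terms are ignored.

```python
def conv_params(c_in: int, c_out: int, k: int) -> int:
    """Weights in a standard k x k convolution (bias omitted)."""
    return c_in * c_out * k * k


def depthwise_separable_params(c_in: int, c_out: int, k: int) -> int:
    """Weights in a k x k depthwise conv followed by a 1 x 1 pointwise conv."""
    depthwise = c_in * k * k   # one k x k filter per input channel
    pointwise = c_in * c_out   # 1 x 1 convolution that mixes channels
    return depthwise + pointwise


def relative_reduction(baseline: int, reduced: int) -> float:
    """Percentage by which `reduced` is smaller than `baseline`."""
    return 100.0 * (baseline - reduced) / baseline


# Hypothetical layer sizes, not taken from the paper.
standard = conv_params(128, 256, 3)
separable = depthwise_separable_params(128, 256, 3)
reduction = relative_reduction(standard, separable)
print(f"standard={standard}, separable={separable}, reduction={reduction:.2f}%")
```

This per-layer figure is not comparable to the whole-model reductions reported in the abstract, which depend on how many layers are replaced and on the modules added elsewhere in the network.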