
Computer Engineering ›› 2025, Vol. 51 ›› Issue (5): 133-142. doi: 10.19678/j.issn.1000-3428.0069026

• Artificial Intelligence and Pattern Recognition •

Aggregation Pedestrian Detection Model Based on Improved YOLOv8

HUANG Kun1, QI Zhaojian2, WANG Juanmin1, HU Qian1, HU Weichao1, PI Jianyong1,3

  1. College of Computer Science and Technology, State Key Laboratory of Public Big Data, Guizhou University, Guiyang 550025, Guizhou, China;
  2. Guizhou Mobile Information Technology Co., Ltd., Guiyang 550001, Guizhou, China;
  3. Research Center of Cloud Computing and Internet of Things, Guizhou University, Guiyang 550025, Guizhou, China
  • Received: 2023-12-14  Revised: 2024-03-06  Publication date: 2025-05-15  Release date: 2024-05-23
  • Corresponding author: PI Jianyong, E-mail: pijianyong@139.com
  • Supported by:
    Guizhou Provincial Science and Technology Support Program (黔科合支撑一般430).

Abstract: Pedestrian detection in crowded scenes is a key technology for intelligent monitoring of public spaces: object detection methods locate and count the pedestrians in a video, which enables intelligent monitoring of crowds. In densely populated scenes, pedestrians are easily missed because of occlusion and small target size. To address this, this paper proposes Crowd-YOLOv8, an improved YOLOv8 detection model. First, the nostride-Conv-SPD module is used in the backbone network to strengthen the extraction of fine-grained information, such as the features of small objects in images. Second, a small object detection head and the CARAFE upsampling operator are introduced into the neck of the YOLOv8 network to fuse features across scales and improve detection of small targets. Experimental results show that the proposed model achieves an mAP@0.5 of 84.3% and an mAP@0.5:0.95 of 58.2% on the CrowdHuman dataset, improvements of 3.7 and 5.2 percentage points, respectively, over the original YOLOv8n. On the WiderPerson dataset, it achieves an mAP@0.5 of 88.4% and an mAP@0.5:0.95 of 67.4%, improvements of 1.1 and 1.5 percentage points, respectively, over the original YOLOv8n.

Key words: aggregation pedestrian detection, YOLOv8 network, nostride-Conv-SPD module, CARAFE operator, small object detection head
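
The abstract names the nostride-Conv-SPD module but does not define it. As an illustration only, the sketch below assumes that "SPD" refers to a space-to-depth rearrangement, which moves every pixel of a feature map into extra channels instead of discarding samples the way a strided convolution or pooling layer does. The function name `space_to_depth`, the `scale` parameter, the channel ordering, and the nested-list layout are hypothetical choices made for this sketch and are not taken from the paper; the non-strided convolution that would accompany this step is omitted.

```python
def space_to_depth(fmap, scale=2):
    """Rearrange a C x H x W feature map (nested lists) into
    (C * scale * scale) x (H / scale) x (W / scale).

    Assumption for illustration: no value is dropped; spatial
    resolution is traded for additional channels.
    """
    channels = len(fmap)
    height, width = len(fmap[0]), len(fmap[0][0])
    if height % scale or width % scale:
        raise ValueError("height and width must be divisible by scale")

    out = []
    # Each (dy, dx) offset selects one sub-sampled copy of every channel.
    for dy in range(scale):
        for dx in range(scale):
            for c in range(channels):
                out.append([row[dx::scale] for row in fmap[c][dy::scale]])
    return out


# One 4x4 channel becomes four 2x2 channels.
fmap = [[[0, 1, 2, 3],
         [4, 5, 6, 7],
         [8, 9, 10, 11],
         [12, 13, 14, 15]]]
blocks = space_to_depth(fmap)
```

Here `blocks` holds four 2x2 channels that together contain all 16 input values, so a later convolution with stride 1 would still see every pixel. This is one plausible reading of why such a module could help with small targets; the paper's actual layer layout may differ.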

CLC number: