Computer Engineering ›› 2022, Vol. 48 ›› Issue (5): 251-257. doi: 10.19678/j.issn.1000-3428.0061423

• Graphics and Image Processing •

A Multi-scale Crowd Counting Algorithm with Removing Background Interference

GUO Aixin1, XIA Yinfeng2, WANG Dawei1, LU Bin1   

    1. College of Physics and Information Engineering, Shanxi Normal University, Taiyuan 030006, China;
    2. Department of Automation, University of Science and Technology of China, Hefei 230026, China
  • Received: 2021-04-25 Revised: 2021-05-28 Published: 2021-06-01

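  • About the authors: GUO Aixin (born 1991), female, teaching assistant, M.S., whose main research interests are computer vision and deep learning; XIA Yinfeng, Ph.D. candidate; WANG Dawei and LU Bin, lecturers, Ph.D.
  • Funding:
    National Natural Science Foundation of China (62004119).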

Abstract: Crowd counting aims to estimate the number of people in crowd images or videos. It can help prevent stampede accidents and is widely used in security early warning, urban planning, and the management of large gatherings. However, because of crowd scale variation, background interference, uneven crowd distribution, occlusion, and perspective effects, counting the crowd in a single image remains a very challenging task. To address multi-scale variation and background interference in crowd counting, a multi-scale crowd counting algorithm that resists background interference is proposed. Building on the VGG16 network structure, a feature pyramid is introduced to construct a multi-scale feature fusion backbone network that handles crowd scale variation. A Double-Head-CC structure is designed to perform foreground-background segmentation and density map prediction on the fused feature map, suppressing background interference. Based on the local correlation of the density map and on multi-task learning, multiple loss functions and a multi-task joint loss function are defined to optimize the network. The model is trained and evaluated on the ShanghaiTech, UCF-QNRF, and JHU-CROWD++ datasets. Experimental results show that the algorithm predicts the crowd density distribution and the crowd count well, with high accuracy, strong robustness, and good generalization performance.

Key words: crowd counting, deep learning, feature pyramid, loss function, density map
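
The two-head idea in the abstract can be illustrated with a minimal sketch in plain Python. It is not the authors' implementation: the abstract does not give the form of the losses, so mean squared error for the density head, binary cross-entropy for the segmentation head, the weight `seg_weight`, the 0.5 threshold, and all function names are assumptions made for illustration. The loss based on local correlation of the density map is not reproduced here.

```python
import math


def count_from_density(density):
    """The estimated crowd count is the sum of all density-map values."""
    return sum(sum(row) for row in density)


def suppress_background(density, fg_prob, threshold=0.5):
    """Zero the density wherever the segmentation head predicts background.

    The threshold value is an assumption, not taken from the paper.
    """
    return [
        [d if p >= threshold else 0.0 for d, p in zip(d_row, p_row)]
        for d_row, p_row in zip(density, fg_prob)
    ]


def mse_loss(pred, target):
    """Mean squared error over all pixels (assumed density-head loss)."""
    n = sum(len(row) for row in pred)
    total = sum(
        (p - t) ** 2
        for p_row, t_row in zip(pred, target)
        for p, t in zip(p_row, t_row)
    )
    return total / n


def bce_loss(prob, label, eps=1e-7):
    """Binary cross-entropy over all pixels (assumed segmentation-head loss)."""
    n = sum(len(row) for row in prob)
    total = 0.0
    for p_row, y_row in zip(prob, label):
        for p, y in zip(p_row, y_row):
            p = min(max(p, eps), 1.0 - eps)  # clip to avoid log(0)
            total -= y * math.log(p) + (1 - y) * math.log(1 - p)
    return total / n


def joint_loss(density_pred, density_gt, fg_prob, fg_gt, seg_weight=1.0):
    """Weighted sum of the two task losses (hypothetical weighting scheme)."""
    return mse_loss(density_pred, density_gt) + seg_weight * bce_loss(fg_prob, fg_gt)


# Toy 2x2 example: the bottom-left pixel carries spurious background density.
density_pred = [[0.5, 0.5], [0.2, 0.0]]
fg_prob = [[0.9, 0.8], [0.1, 0.2]]
density_gt = [[0.5, 0.5], [0.0, 0.0]]
fg_gt = [[1, 1], [0, 0]]

cleaned = suppress_background(density_pred, fg_prob)
raw_count = count_from_density(density_pred)
clean_count = count_from_density(cleaned)
loss = joint_loss(density_pred, density_gt, fg_prob, fg_gt)
```

In this toy case the segmentation output removes the spurious response before summation, so the count drops from the raw sum to the sum over foreground pixels only. In a trained network both heads would share the fused multi-scale features and be optimized together through the joint loss.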

