
Computer Engineering ›› 2021, Vol. 47 ›› Issue (6): 305-311. doi: 10.19678/j.issn.1000-3428.0057878

• Development Research and Engineering Application •

Robot-Assisted Crowd Evacuation Algorithm Based on Deep Spatio-Temporal Q-network

TAN Mei, LIU Shihao, ZHOU Wan, CHEN Guowen, HU Xuemin   

  1. School of Computer Science and Information Engineering, Hubei University, Wuhan 430062, China
  • Received: 2020-03-27 Revised: 2020-05-12 Published: 2020-06-03
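  • Supported by: National Natural Science Foundation of China, Youth Fund (61806076); Natural Science Foundation of Hubei Province, Youth Fund (2018CFB158); Hubei Provincial College Students' Innovation and Entrepreneurship Training Program (S201910512026).
  • Contact e-mail: tanbella77@163.com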

基于深度时空Q网络的机器人疏散人群算法

谭嵋, 刘士豪, 周婉, 陈国文, 胡学敏   

  1. 湖北大学 计算机与信息工程学院, 武汉 430062
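  • About the authors: TAN Mei (born 1998), female, undergraduate student, main research interest: deep reinforcement learning; LIU Shihao, ZHOU Wan, and CHEN Guowen, undergraduate students; HU Xuemin, associate professor, Ph.D.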

Abstract: Existing robot-assisted crowd evacuation methods suffer from low robot flexibility, limited adaptability to different scenarios, and low evacuation efficiency. To address these problems, this paper proposes a robot-assisted crowd evacuation algorithm based on deep reinforcement learning. A human-machine social force model is used to simulate the state of the crowd during an emergency evacuation, and a purpose-designed convolutional neural network extracts the complex spatial features of the evacuation scene. The traditional deep Q-network is combined with a Long Short-Term Memory (LSTM) network so that the robot can retain long-term temporal information during learning. Experimental results show that, compared with existing robot-assisted evacuation methods based on the human-machine social force model, the proposed algorithm improves the efficiency of robot-assisted evacuation in different simulation scenarios, which verifies its validity and feasibility.

Key words: Deep Spatio-Temporal Q-Network (DSTQN), Long Short-Term Memory (LSTM), crowd evacuation, robot, deep reinforcement learning
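
As a rough illustration of the deep Q-network component named in the abstract, the sketch below computes the standard one-step Q-learning target that a deep Q-network is usually trained to regress toward. It is not the paper's implementation: the function name `td_targets`, the discount factor `gamma`, and the sample numbers are assumptions made for this example. In a recurrent variant such as the one the abstract describes, the Q-values would typically come from a network that feeds convolutional features through an LSTM, while the target keeps the same form.

```python
from typing import List, Sequence


def td_targets(
    rewards: Sequence[float],
    next_q_values: Sequence[Sequence[float]],
    dones: Sequence[bool],
    gamma: float = 0.99,  # assumed discount factor, not taken from the paper
) -> List[float]:
    """Return one-step Q-learning targets r + gamma * max_a' Q(s', a').

    rewards[i]       reward received on transition i
    next_q_values[i] estimated Q-values for every action in the next state
    dones[i]         True if transition i ended the episode
    """
    targets = []
    for reward, q_next, done in zip(rewards, next_q_values, dones):
        # A terminal transition has no future return to bootstrap from.
        bootstrap = 0.0 if done else gamma * max(q_next)
        targets.append(reward + bootstrap)
    return targets


if __name__ == "__main__":
    # Two hypothetical transitions; the second one ends the episode.
    print(td_targets([1.0, 0.0], [[0.5, 2.0], [1.0, 3.0]], [False, True], gamma=0.9))
```

The terminal flag matters here: without it, the target for the last step of an evacuation episode would include value from a state that is never visited.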

摘要: 针对目前人群疏散方法中机器人灵活性低、场景适应性有限与疏散效率低的问题,提出一种基于深度强化学习的机器人疏散人群算法。利用人机社会力模型模拟突发事件发生时的人群疏散状态,设计一种卷积神经网络结构提取人群疏散场景中复杂的空间特征,将传统的深度Q网络与长短期记忆网络相结合,解决机器人在学习中无法记忆长期时间信息的问题。实验结果表明,与现有基于人机社会力模型的机器人疏散人群方法相比,该算法能够提高在不同仿真场景中机器人疏散人群的效率,从而验证了算法的有效性。

关键词: 深度时空Q网络, 长短期记忆网络, 人群疏散, 机器人, 深度强化学习
