Computer Engineering

   

PPO-Based Algorithm for Indoor Crowd Evacuation

  

  • Published: 2025-04-24

PPO-Based Path Planning Algorithm for Indoor Crowd Evacuation

Abstract: With terrorist attacks occurring frequently, crowd evacuation path planning in indoor public places has received increasing attention. To address this problem, a path planning method based on the Proximal Policy Optimization (PPO) algorithm is proposed to improve the efficiency and safety of pedestrian evacuation. First, the indoor terrorist attack scenario is described, and the static obstacles, unoccupied locations, dynamic obstacles, exits, and pedestrians in an indoor public place are modeled with a cellular automaton. On this basis, a feature construction method based on distance information is proposed: by combining the distance from a pedestrian to the exits in a threat-free environment with the corresponding distance when a threat is present, it builds pedestrian features, including a shortest-path feature and a safe-path feature, that characterize how difficult it is to escape along an evacuation path. Finally, evacuation path planning is formulated as a reinforcement learning problem, and a reward function is designed from evacuation efficiency, a penalty for deaths, and a reward for successful escape. Real-time feedback from the environment is used to provide pedestrians with an evacuation strategy, so that the PPO algorithm optimizes the escape paths as a whole. Compared with existing field-based methods, the proposed method improves the efficiency and safety of crowd evacuation across different simulation scenarios, especially in complex, high-density environments. Ablation experiments further verify the effectiveness of the shortest-path and safe-path features.
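The distance-based features can be illustrated with a minimal sketch. It assumes a 4-connected grid, unit step costs, and a breadth-first search from the exits; the cell codes and the function names `exit_distances` and `pedestrian_features` are hypothetical, since the abstract does not specify the encoding or the distance computation actually used.

```python
from collections import deque

# Hypothetical cell codes; the abstract does not give the actual encoding.
FREE, WALL, EXIT, THREAT = 0, 1, 2, 3


def exit_distances(grid, blocked):
    """Multi-source BFS from all exit cells.

    Returns a grid of step counts to the nearest exit. Cells whose code is
    in `blocked` are impassable; unreachable cells keep the value None.
    """
    rows, cols = len(grid), len(grid[0])
    dist = [[None] * cols for _ in range(rows)]
    queue = deque()
    for r in range(rows):
        for c in range(cols):
            if grid[r][c] == EXIT:
                dist[r][c] = 0
                queue.append((r, c))
    while queue:
        r, c = queue.popleft()
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            if (0 <= nr < rows and 0 <= nc < cols
                    and dist[nr][nc] is None
                    and grid[nr][nc] not in blocked):
                dist[nr][nc] = dist[r][c] + 1
                queue.append((nr, nc))
    return dist


def pedestrian_features(grid, pos):
    """Return (shortest-path distance, safe-path distance) for one pedestrian."""
    shortest = exit_distances(grid, blocked={WALL})        # threat ignored
    safe = exit_distances(grid, blocked={WALL, THREAT})    # threat avoided
    r, c = pos
    return shortest[r][c], safe[r][c]


demo_grid = [
    [EXIT,   FREE, FREE],
    [THREAT, WALL, FREE],
    [FREE,   FREE, FREE],
]
print(pedestrian_features(demo_grid, (2, 0)))  # prints (2, 6)
```

In this toy layout the direct route to the exit passes through the threat cell, so the safe-path distance (6) is much larger than the shortest-path distance (2). A gap of this kind is one plausible way to express escape difficulty as a feature.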

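Abstract (translated from the Chinese): With terrorist attacks occurring frequently, existing evacuation path planning methods struggle in complex, dynamic environments; in particular, when a sudden terrorist attack occurs, they cannot respond effectively to blocked paths and changing hazards. How to evacuate crowds efficiently and safely in complex, high-density environments has therefore become a pressing technical problem. To address crowd evacuation path planning in indoor public places, a path planning method based on the Proximal Policy Optimization (PPO) algorithm is proposed to improve the efficiency and safety of pedestrian evacuation. First, the indoor terrorist attack scenario is described, and the static obstacles, unoccupied locations, dynamic obstacles, exits, and pedestrians in an indoor public place are modeled with a cellular automaton. On this basis, a feature construction method based on distance information is proposed: by combining the distance from a pedestrian to the exits in a threat-free environment with the corresponding distance when a threat is present, it builds pedestrian features, including a shortest-path feature and a safe-path feature, that characterize how difficult it is to escape along an evacuation path. Finally, evacuation path planning is formulated as a reinforcement learning problem, and a reward function is designed from evacuation efficiency, a penalty for deaths, and a reward for successful escape. Real-time feedback from the environment is used to provide pedestrians with an evacuation strategy, so that the PPO algorithm optimizes the escape paths as a whole. Compared with existing field-based methods, the proposed method improves the efficiency and safety of crowd evacuation across different simulation scenarios, especially in complex, high-density environments. Ablation experiments further verify the effectiveness of the shortest-path and safe-path features.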
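As a rough illustration of the reinforcement learning formulation, the sketch below pairs a three-part reward (efficiency, death penalty, escape reward) with the standard PPO clipped surrogate for a single sample. The numeric values, the per-step cost used as the efficiency term, and the names `RewardConfig`, `step_reward`, and `clipped_surrogate` are assumptions made for illustration; the abstract does not give the actual reward magnitudes or hyperparameters.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RewardConfig:
    # All values are illustrative assumptions, not taken from the paper.
    step_cost: float = -0.01      # efficiency term: small cost per time step
    death_penalty: float = -1.0   # pedestrian is killed
    escape_reward: float = 1.0    # pedestrian reaches an exit


def step_reward(escaped: bool, died: bool,
                cfg: RewardConfig = RewardConfig()) -> float:
    """Per-step reward for one pedestrian under the assumed scheme."""
    if died:
        return cfg.death_penalty
    if escaped:
        return cfg.escape_reward
    return cfg.step_cost


def clipped_surrogate(ratio: float, advantage: float, eps: float = 0.2) -> float:
    """Standard PPO clipped objective for one sample.

    `ratio` is the new-policy probability of the action divided by the
    old-policy probability; the value returned is to be maximized.
    """
    clipped_ratio = max(1.0 - eps, min(ratio, 1.0 + eps))
    return min(ratio * advantage, clipped_ratio * advantage)
```

The per-step cost encourages shorter evacuation times, while the clipping keeps each policy update close to the policy that collected the data, which is the defining feature of PPO.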