
Computer Engineering ›› 2024, Vol. 50 ›› Issue (2): 105-112. doi: 10.19678/j.issn.1000-3428.0066799

• Artificial Intelligence and Pattern Recognition •

Dynamic Obstacle Avoidance for Service Robots Based on Spatio-Temporal Graph Attention Network

Haijun DU*, Su YU

  1. School of Electronic and Electrical Engineering, Shanghai University of Engineering Science, Shanghai 201620, China
  • Received: 2023-01-18 Online: 2024-02-15 Published: 2023-05-05
  • Contact: Haijun DU

  • Funding: Scientific Research Program of the Science and Technology Commission of Shanghai Municipality (17511110204)

Abstract:

To address the collisions, freezing, and unnatural paths that service robots exhibit in dense crowds of autonomously acting pedestrians, this study proposes a dynamic obstacle avoidance algorithm based on a spatio-temporal graph attention network within a Deep Reinforcement Learning (DRL) framework. The spatio-temporal graph attention network serves as the decision function of the Proximal Policy Optimization (PPO) algorithm. First, a Gated Recurrent Unit (GRU) controls how much the robot remembers and forgets about its environment and extracts the environment's temporal features, giving the robot some ability to predict pedestrian movement trends. Second, a graph attention network captures the implicit spatial interaction features between the robot and pedestrians, enabling the robot to find collision-free paths. Finally, the network is trained with PPO so that the robot can complete collision-free navigation tasks in a crowd. The algorithm is verified in simulation in a dynamic closed environment with 2.5 m² per capita. Compared with the non-learning Dynamic Window Algorithm (DWA), the proposed algorithm improves the navigation success rate by 71 percentage points. Compared with the learning-based DSRNN-RL algorithm, it improves the navigation success rate by 3 percentage points and yields shorter navigation paths. A real-time navigation test in the Gazebo environment shows an average inference time of 21.90 ms, which meets the requirements of real-time navigation.

Key words: service robot, dynamic obstacle avoidance, Deep Reinforcement Learning (DRL), spatio-temporal graph attention network, real-time navigation
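
The three building blocks named in the abstract (a GRU for temporal features, graph attention for robot-pedestrian interaction, and PPO for training) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract gives no layer sizes, weight shapes, or reward design, so every function name, parameter, and default below is an assumption made for illustration. The sketch uses the textbook GRU update (biases omitted), the scoring rule from the original graph attention formulation (without the shared linear projection), and the standard PPO clipped surrogate.

```python
import math


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def matvec(m, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]


def gru_step(h, x, p):
    """One GRU update of hidden state h given observation x.

    p is a hypothetical dict of weight matrices Wz, Uz, Wr, Ur, Wh, Uh.
    The update gate z decides how much old memory is replaced (forgetting),
    the reset gate r decides how much old memory feeds the candidate state.
    """
    z = [sigmoid(a + b) for a, b in zip(matvec(p["Wz"], x), matvec(p["Uz"], h))]
    r = [sigmoid(a + b) for a, b in zip(matvec(p["Wr"], x), matvec(p["Ur"], h))]
    rh = [ri * hi for ri, hi in zip(r, h)]
    cand = [math.tanh(a + b) for a, b in zip(matvec(p["Wh"], x), matvec(p["Uh"], rh))]
    return [(1 - zi) * hi + zi * ci for zi, hi, ci in zip(z, h, cand)]


def gat_attention(robot, peds, a, slope=0.2):
    """Attention of the robot node over pedestrian nodes.

    Each score is LeakyReLU(a . [robot ; ped]); a softmax turns the scores
    into weights, and the context is the weighted sum of pedestrian features.
    Assumes at least one pedestrian; a has length len(robot) + len(ped).
    """
    scores = []
    for ped in peds:
        e = sum(w * f for w, f in zip(a, robot + ped))
        scores.append(e if e > 0 else slope * e)
    top = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    alpha = [v / total for v in exps]
    dim = len(peds[0])
    context = [sum(al * ped[k] for al, ped in zip(alpha, peds)) for k in range(dim)]
    return alpha, context


def ppo_clip_objective(ratio, advantage, eps=0.2):
    """PPO clipped surrogate for one sample.

    ratio is pi_new(a|s) / pi_old(a|s); eps=0.2 is a common default,
    not a value reported in the abstract.
    """
    clipped = max(1 - eps, min(1 + eps, ratio))
    return min(ratio * advantage, clipped * advantage)
```

In a full pipeline of this kind, the GRU output for each agent would typically feed the attention step, and the resulting context vector would feed the policy and value heads optimized with the clipped objective. How the paper actually wires these pieces together is not stated in the abstract.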
