Dynamic Obstacle Avoidance for Service Robots Based on Spatio-Temporal Graph Attention Network

doi:10.19678/j.issn.1000-3428.0066799

Abstract

To address the collisions, freezing, and unnatural paths that occur when service robots navigate dense crowds of pedestrians who make their own autonomous decisions, this study proposes a dynamic obstacle avoidance algorithm for service robots based on a spatio-temporal graph attention network within a Deep Reinforcement Learning (DRL) framework. The spatio-temporal graph attention network serves as the decision function of the Proximal Policy Optimization (PPO) algorithm. First, a Gated Recurrent Unit (GRU) controls how much of the environment the robot remembers and forgets and extracts the temporal features of that environment, giving the robot some ability to predict pedestrian movement trends. Second, a graph attention network captures the implicit spatial interaction features between the robot and pedestrians, enabling the robot to find collision-free paths. Finally, the spatio-temporal graph attention network is trained with PPO so that the robot can complete collision-free navigation tasks in a crowd. The algorithm is verified by simulation experiments in a dynamic closed environment with 2.5 m² per person. Compared with the non-learning Dynamic Window Approach (DWA), the proposed algorithm improves the navigation success rate by 71 percentage points. Compared with the learning-based DSRNN-RL algorithm, it improves the navigation success rate by 3 percentage points and produces a shorter navigation path. A real-time navigation test in the Gazebo environment shows that the average inference time of the algorithm is 21.90 ms, which meets the requirements of real-time navigation.
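
As a rough illustration of the spatial step described in the abstract, the sketch below computes single-head attention weights of a robot node over pedestrian nodes using the standard graph attention formulation (a shared linear map, a LeakyReLU-scored concatenation, and a softmax). It is not the authors' implementation: the function name `robot_attention`, the feature dimensions, the random weights `W` and `a`, and the example feature vectors are all assumptions chosen for illustration. In the paper, the per-agent features would presumably come from the GRU stage rather than being hand-written.

```python
import math
import random


def leaky_relu(x, slope=0.2):
    # LeakyReLU with the slope commonly used in graph attention networks.
    return x if x > 0 else slope * x


def matvec(W, v):
    # Multiply matrix W (list of rows) by vector v.
    return [sum(w * x for w, x in zip(row, v)) for row in W]


def softmax(scores):
    # Numerically stable softmax over a list of floats.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def robot_attention(robot_feat, ped_feats, W, a):
    """Hypothetical single-head attention of the robot over pedestrians.

    Returns the attention weights and the attention-weighted sum of the
    transformed pedestrian features.
    """
    zr = matvec(W, robot_feat)
    zs = [matvec(W, h) for h in ped_feats]
    # Score each robot-pedestrian pair from the concatenated features.
    scores = [leaky_relu(sum(ai * x for ai, x in zip(a, zr + z))) for z in zs]
    alphas = softmax(scores)
    context = [sum(al * z[k] for al, z in zip(alphas, zs)) for k in range(len(zr))]
    return alphas, context


# Illustrative (assumed) setup: 4-dim input features, 3-dim hidden features.
rng = random.Random(0)
in_dim, out_dim = 4, 3
W = [[rng.uniform(-1, 1) for _ in range(in_dim)] for _ in range(out_dim)]
a = [rng.uniform(-1, 1) for _ in range(2 * out_dim)]

robot = [0.0, 0.0, 0.5, 0.0]
pedestrians = [
    [1.0, 0.5, -0.3, 0.0],
    [3.0, -2.0, 0.0, 0.4],
    [0.5, 0.2, -0.5, -0.1],
]

alphas, context = robot_attention(robot, pedestrians, W, a)
print("attention weights:", alphas)
print("aggregated feature:", context)
```

The weights sum to one, so they can be read as how strongly the robot attends to each pedestrian. In a trained network `W` and `a` would be learned parameters rather than random values.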

Key words: service robot, dynamic obstacle avoidance, Deep Reinforcement Learning (DRL), spatio-temporal graph attention network, real-time navigation

Haijun DU, Su YU. Dynamic Obstacle Avoidance for Service Robots Based on Spatio-Temporal Graph Attention Network[J]. Computer Engineering, 2024, 50(2): 105-112.

URL: http://www.ecice06.com/EN/10.19678/j.issn.1000-3428.0066799

http://www.ecice06.com/EN/Y2024/V50/I2/105

Figures/Tables 8

Fig.1 Navigation environment schematic diagram

Fig.2 Navigation spatio-temporal graph

Fig.3 Structure of spatio-temporal graph attention network

Fig.4 The attention values of robots towards pedestrians

Fig.5 Robot navigation trajectories
