Obstacle Avoidance Method Using DQN to Classify Obstacles in Unmanned Driving

doi:10.19678/j.issn.1000-3428.0068769

Abstract

Safety is the primary consideration for unmanned vehicles, and obstacle avoidance is the most effective means of ensuring driving safety. Learning-based obstacle avoidance methods have attracted research attention because they can learn from the environment and make decisions directly from perception. The Deep Q-Network (DQN), a popular reinforcement learning method, has made significant progress in obstacle avoidance for unmanned driving; however, existing DQN-based methods do not consider how the obstacle type affects the avoidance strategy. We therefore propose a vehicle driving decision-making framework, the Classification Security DQN (CSDQN), which is built on accurate obstacle classification. The framework uses the obstacle type and environmental information to produce safer driving decisions. First, detected obstacles are classified according to their safety level. Second, a safety evaluation function is proposed for each obstacle type; it evaluates safety using position uncertainty and a distance-based safety metric. Third, the CSDQN framework uses the obstacle type, relative position information, and the safety evaluation function to optimize the policy iteratively until the final model is obtained. The proposed method is compared in simulation with state-of-the-art deep reinforcement learning methods. The results show that, in scenarios with multiple obstacle types, CSDQN improves safety by 43.9% and 4.2% and stability by 17.8% and 3.7% compared with DQN and SDQN, respectively.

Keywords: unmanned driving, Deep Q-Network (DQN), classification-based obstacle avoidance, evaluation function, safety
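
The abstract outlines a three-step pipeline (classify obstacles by safety level, score safety from position uncertainty and distance, feed that score into DQN training) without giving formulas. The following Python sketch shows one way such a pipeline could be wired together. Everything in it is an assumption made for illustration and is not taken from the paper: the class names and weights in `CLASS_WEIGHT`, the two-sigma distance margin, the 10 m base safe distance, and the function names `safety_score`, `safety_reward`, and `td_target`.

```python
from dataclasses import dataclass

# Hypothetical safety levels (not from the paper): a larger weight means
# the obstacle class demands a larger safe distance.
CLASS_WEIGHT = {"pedestrian": 3.0, "vehicle": 2.0, "static": 1.0}


@dataclass
class Obstacle:
    kind: str        # obstacle class reported by the detector
    distance: float  # estimated distance to the ego vehicle, in metres
    sigma: float     # standard deviation of the position estimate, in metres


def safety_score(ob: Obstacle, safe_dist: float = 10.0) -> float:
    """Return a score in [0, 1], where 1 means the obstacle is far enough away."""
    # Be pessimistic about position: shrink the distance by two sigma (assumed margin).
    effective = max(ob.distance - 2.0 * ob.sigma, 0.0)
    # Scale the required distance by the class weight (assumed rule).
    required = safe_dist * CLASS_WEIGHT[ob.kind]
    return min(effective / required, 1.0)


def safety_reward(obstacles: list[Obstacle]) -> float:
    """Map the worst-case safety score to a penalty in [-1, 0]."""
    if not obstacles:
        return 0.0
    return min(safety_score(ob) for ob in obstacles) - 1.0


def td_target(reward: float, next_q_values: list[float],
              gamma: float = 0.99, done: bool = False) -> float:
    """Standard DQN target: r + gamma * max_a' Q(s', a'), or r at episode end."""
    if done:
        return reward
    return reward + gamma * max(next_q_values)


if __name__ == "__main__":
    scene = [Obstacle("pedestrian", 16.0, 0.5), Obstacle("static", 15.0, 0.0)]
    r = safety_reward(scene)
    print(r, td_target(r, [1.0, 2.0], gamma=0.5))
```

In this sketch the pedestrian dominates the reward even though it is slightly farther away than the static obstacle, which is the kind of type-dependent behaviour the abstract attributes to CSDQN; the actual evaluation function used by the authors may differ substantially.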

LIU Hangbo, MA Li, LI Yang, MA Dongchao, FU Yingxun. Obstacle Avoidance Method Using DQN to Classify Obstacles in Unmanned Driving[J]. Computer Engineering, 2024, 50(11): 380-389.

https://www.ecice06.com/CN/Y2024/V50/I11/380

Figures/Tables: 7

Fig.1 Parameters in the state

Fig.2 Graphical representation of the reward function R_type_safe

Fig.3 CSDQN architecture

Fig.4 Snapshots of the evaluation scenarios

Fig.5 Comparison of training scenario scores using different methods

Fig.6 Comparison of safety scores in training scenarios using different methods

Fig.7 Comparison of driving distances in random scenarios using different methods

