Autonomous Driving Ramp Merging Model Based on Reinforcement Learning

Computer Engineering, 2018, Vol. 44, Issue (7): 20-24, 31. doi: 10.19678/j.issn.1000-3428.0050990

Special topic: Intelligent Transportation

QIAO Liang^a, BAO Hong^a, XUAN Zuxing^a, LIANG Jun^a, PAN Feng^b

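Received: 2018-01-28 Online: 2018-07-15 Published: 2018-07-15
About the authors: QIAO Liang (born 1991), male, master's student, whose main research interests are decision-making and control for autonomous driving; BAO Hong, professor; XUAN Zuxing, associate professor; LIANG Jun, professor; PAN Feng, associate professor.
Supported by:
Key Support Project "Research on Brain Cognition Technology, Platform and Transformation for Intelligent Vehicle Driving" of the National Natural Science Foundation of China Major Research Plan "Cognitive Computing of Visual and Auditory Information" (91420202); Scientific Research Program of the Beijing Municipal Education Commission (KM201811417006); Newton Fund of the Royal Academy of Engineering, UK (UK-CIAPP\324); Support Program for High-Level Teacher Team Development in Beijing Municipal Universities (IDHT20170511).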

Abstract:

Traditional reinforcement learning methods are limited to discrete state spaces and discrete action spaces, so they cannot be applied well to ramp merging scenarios. To address this, an autonomous driving ramp merging model based on reinforcement learning is constructed. The reinforcement learning model is built with a deep Q network, and the ramp merging problem is formulated as a reinforcement learning problem under this model and then solved. Experimental results show that the model adopts different strategies for different speeds of the surrounding vehicles, thereby improving the intelligent decision-making level of autonomous driving in ramp merging scenarios.

Key words: autonomous driving, decision-making, ramp merging, reinforcement learning, deep Q network
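
The abstract names a deep Q network but gives no details of the state, action set, or reward, so the following is only a minimal sketch of the Q-learning update that such a network is trained with. Everything concrete in it is an assumption made for illustration: the three actions, the state features (normalized gap, normalized relative speed, bias term), the reward, and the hyperparameters. To stay dependency-free, a linear approximator stands in for the deep network; it is not the authors' model.

```python
import random

ACTIONS = ("decelerate", "keep", "accelerate")  # hypothetical discrete action set
GAMMA = 0.9    # discount factor (assumed)
ALPHA = 0.01   # learning rate (assumed)


def q_values(weights, state):
    """Linear stand-in for the Q network: one weight vector per action."""
    return [sum(w * x for w, x in zip(weights[a], state))
            for a in range(len(ACTIONS))]


def select_action(weights, state, epsilon, rng):
    """Epsilon-greedy choice over the discrete actions."""
    if rng.random() < epsilon:
        return rng.randrange(len(ACTIONS))
    q = q_values(weights, state)
    return q.index(max(q))


def td_target(weights, reward, next_state, done):
    """Q-learning target: r if terminal, else r + gamma * max_a' Q(s', a')."""
    if done:
        return reward
    return reward + GAMMA * max(q_values(weights, next_state))


def update(weights, state, action, target):
    """One semi-gradient step that moves Q(s, a) toward the target."""
    error = target - q_values(weights, state)[action]
    weights[action] = [w + ALPHA * error * x
                       for w, x in zip(weights[action], state)]
    return error


# Hypothetical continuous state: (normalized gap, normalized relative speed, bias).
rng = random.Random(0)
weights = [[0.0, 0.0, 0.0] for _ in ACTIONS]
state = (0.4, -0.2, 1.0)
next_state = (0.35, -0.1, 1.0)

action = select_action(weights, state, epsilon=0.1, rng=rng)
target = td_target(weights, reward=1.0, next_state=next_state, done=False)
error = update(weights, state, action, target)
```

The state stays continuous while only the action set is discrete, which is the property that lets a Q function approximator handle inputs such as vehicle speed without discretizing them.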

CLC number: TP393

QIAO Liang, BAO Hong, XUAN Zuxing, LIANG Jun, PAN Feng. Autonomous Driving Ramp Merging Model Based on Reinforcement Learning[J]. Computer Engineering, 2018, 44(7): 20-24, 31.

http://www.ecice06.com/CN/Y2018/V44/I7/20
