Computer Engineering ›› 2020, Vol. 46 ›› Issue (6): 50-59. doi: 10.19678/j.issn.1000-3428.0054479

• Artificial Intelligence and Pattern Recognition •

Micromanipulation Method Combining Deep Learning and Search for Real-Time Strategy Game

CHEN Peng (陈鹏), WANG Zilei (王子磊)

  1. Department of Automation, University of Science and Technology of China, Hefei 230027, China
  • Received: 2019-04-03 Revised: 2019-05-08 Published: 2019-05-24
  • About the authors: CHEN Peng (born 1994), male, master's student, whose main research interest is machine game playing; WANG Zilei, associate professor.
  • Funding:
    National Natural Science Foundation of China (61836008, 61673362).

Abstract: In Real-Time Strategy (RTS) games, micromanipulation refers to controlling multiple combat units to win a battle. Traditional search methods suffer from low search efficiency and a limited search space in large-scale battle scenarios. To address these problems, this paper proposes a method that combines deep learning with online search, so that a learned model guides the search process. A Joint Policy Network (JPN) based on an encoder-decoder convolutional architecture is presented and embedded into three classic search methods, PGS, POE, and SSS+, enabling end-to-end learning of the joint actions of multiple agents. Experimental results show that the method adapts to complex combat scenarios: in two benchmark scenarios of StarCraft: Brood War, it defeats the built-in artificial intelligence with win rates of 95% and 99% respectively, close to the current best benchmark method.

Key words: Real-Time Strategy (RTS) game, micromanipulation, deep learning, Joint Policy Network (JPN), search methods
