
Computer Engineering ›› 2020, Vol. 46 ›› Issue (11): 90-96. doi: 10.19678/j.issn.1000-3428.0055904

• Artificial Intelligence and Pattern Recognition •

Multi-Target Tracking Method Based on Multi-Agent Collaborative Reinforcement Learning

WANG Yiran1, JING Xiaochuan1,2, JIA Fukai2, SUN Yujian2, TONG Yi2

  1. China Aerospace Academy of Systems Science and Engineering, Beijing 100048, China;
  2. Aerospace Hongkang Intelligent Technology (Beijing) Co., Ltd., Beijing 100048, China
  • Received: 2019-09-03 Revised: 2019-11-11 Published: 2020-11-10
  • About the authors: WANG Yiran (born 1994), male, master's student, whose main research interests are target tracking and multi-agent systems; JING Xiaochuan, research fellow; JIA Fukai, SUN Yujian, and TONG Yi, engineers.
  • Funding:
    Guangdong Province Applied Science and Technology Research and Development Fund (2016B010127005).

Abstract: Existing multi-target tracking methods commonly suffer from slow learning, low tracking efficiency, and difficulty in designing collaborative tracking strategies. To address these problems, this paper proposes an improved multi-target tracking method. A task assignment model is built from the numbers of tracking agents and target agents and their environmental information. The model is solved with the Hungarian algorithm on a distance benefit matrix to obtain the task assignment for the tracking agents, and tasks are divided with the optimization objective of shortening the tracking paths to the target agents. A multi-agent collaborative reinforcement learning algorithm then has the agents repeat an exploration-accumulation-learning-decision process in the same environment and update their policies from the accumulated experience data to complete the multi-target tracking task. Simulation results show that, compared with the DDPG and MADDPG methods, the proposed method enables multiple agents to cooperate in forming the shortest tracking routes to multiple moving targets while avoiding collisions and obstacles.

Key words: multi-agent, multi-target tracking, reinforcement learning, task assignment, real-time performance
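
The task assignment step named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes the distance benefit matrix is the plain Euclidean distance between each tracking agent and each target, and that there are no more trackers than targets. The function names `hungarian` and `assign_targets` and the sample coordinates are hypothetical.

```python
import math

def hungarian(cost):
    """Minimum-cost assignment of rows to columns (Hungarian algorithm
    with row/column potentials). Requires len(cost) <= len(cost[0]).
    Returns a list where entry i is the column assigned to row i."""
    n, m = len(cost), len(cost[0])
    if n > m:
        raise ValueError("this sketch assumes rows <= columns")
    INF = float("inf")
    u = [0.0] * (n + 1)      # row potentials (1-indexed)
    v = [0.0] * (m + 1)      # column potentials (1-indexed)
    p = [0] * (m + 1)        # p[j] = row currently matched to column j
    way = [0] * (m + 1)      # back-pointers along the augmenting path
    for i in range(1, n + 1):
        p[0] = i
        j0 = 0
        minv = [INF] * (m + 1)
        used = [False] * (m + 1)
        while True:
            used[j0] = True
            i0, delta, j1 = p[j0], INF, 0
            for j in range(1, m + 1):
                if not used[j]:
                    cur = cost[i0 - 1][j - 1] - u[i0] - v[j]
                    if cur < minv[j]:
                        minv[j], way[j] = cur, j0
                    if minv[j] < delta:
                        delta, j1 = minv[j], j
            # Shift potentials so one more reduced cost becomes zero.
            for j in range(m + 1):
                if used[j]:
                    u[p[j]] += delta
                    v[j] -= delta
                else:
                    minv[j] -= delta
            j0 = j1
            if p[j0] == 0:   # reached an unmatched column
                break
        # Flip the matching along the augmenting path.
        while j0 != 0:
            j1 = way[j0]
            p[j0] = p[j1]
            j0 = j1
    assignment = [-1] * n
    for j in range(1, m + 1):
        if p[j] != 0:
            assignment[p[j] - 1] = j - 1
    return assignment

def assign_targets(trackers, targets):
    """Build a Euclidean distance matrix (an assumed stand-in for the
    paper's distance benefit matrix) and solve the assignment."""
    cost = [[math.hypot(tx - gx, ty - gy) for (gx, gy) in targets]
            for (tx, ty) in trackers]
    return hungarian(cost)

# Hypothetical positions: three tracking agents and three targets.
trackers = [(0, 0), (5, 5), (10, 0)]
targets = [(9, 1), (1, 1), (5, 6)]
print(assign_targets(trackers, targets))
```

With these sample positions each tracker is paired with its nearest target, so the total assigned distance is as small as possible. In the general case the algorithm minimizes the sum over all pairs, which can differ from a greedy nearest-target choice.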

CLC number: