Computer Engineering ›› 2024, Vol. 50 ›› Issue (11): 38-48. doi: 10.19678/j.issn.1000-3428.0069838

• Intelligent Situational Awareness and Computing •

Multi-Antenna Phased Array Radar-Guided Search Resource Optimization Algorithm Based on MADDPG

WANG Teng1,*, HUANG Junsong2, WANG Leting3, ZHANG Caikun4, LI Xiaoyang1

  1. School of Electronics and Information, Northwestern Polytechnical University, Xi'an 710129, Shaanxi, China
  2. Shanghai Institute of Mechanical and Electrical Engineering, Shanghai 201109, China
  3. Shanghai Aerospace Electronic and Communication Equipment Research Institute, Shanghai 201109, China
  4. Unit 63892 of the People's Liberation Army, Luoyang 471003, Henan, China
  • Received: 2024-05-13 Online: 2024-09-05 Published: 2024-11-15
  • Contact: WANG Teng
  • Funding:
    Open Project of the Innovation Workstation of the Key Laboratory of Intelligent Game (ZBKF-23-04)

Abstract:

To address the difficulty of solving for search parameters when traditional single-antenna radar search resource optimization algorithms are applied to complex multi-antenna scenarios, a multi-antenna phased array radar search resource optimization algorithm based on the Multi-Agent Deep Deterministic Policy Gradient (MADDPG) is proposed. First, considering the constraints of the multi-antenna phased array radar scenario and the actual search task requirements of airborne radar, a multi-antenna radar search resource optimization model is established that maximizes the target-averaged cumulative expected detection probability of cluster targets. Second, local and global observation spaces for the agents and a composite reward function with a discount factor are designed. Based on the Actor-Critic structure, the agents' policy networks update the search resource allocation coefficients of each radar antenna online, which solves for the parameters of the optimization model. Finally, simulation results show that the algorithm quickly makes accurate autonomous decisions based on airspace-target coverage and the threat weight coefficient of each target, and that it significantly outperforms the traditional algorithm in multi-antenna phased array radar search resource optimization scenarios.

Key words: multi-antenna phased array radar, radar search resource optimization, multi-agent deep reinforcement learning, Deep Deterministic Policy Gradient (DDPG), radar-guided search for cluster target
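
The abstract names two quantities without giving formulas: a target-averaged cumulative detection probability used as the optimization objective, and a reward with a discount factor. The sketch below is only an illustration of what such quantities commonly look like, not the paper's model. It assumes that scans are statistically independent (so the cumulative probability is one minus the product of per-scan miss probabilities), that targets are averaged with normalized threat weights, and that the discount factor enters as a standard discounted return. All function and parameter names are hypothetical.

```python
from typing import Sequence


def cumulative_detection_probability(p_single: float, n_scans: int) -> float:
    """Probability of at least one detection in n_scans scans.

    Assumption: scans are independent, each with detection probability p_single.
    """
    if not 0.0 <= p_single <= 1.0:
        raise ValueError("p_single must lie in [0, 1]")
    if n_scans < 0:
        raise ValueError("n_scans must be non-negative")
    return 1.0 - (1.0 - p_single) ** n_scans


def weighted_average_detection(
    p_single: Sequence[float],
    n_scans: Sequence[int],
    threat_weights: Sequence[float],
) -> float:
    """Threat-weighted average of per-target cumulative detection probabilities.

    Assumption: weights are non-negative and are normalized by their sum.
    """
    if not (len(p_single) == len(n_scans) == len(threat_weights)):
        raise ValueError("all inputs must have one entry per target")
    total_weight = sum(threat_weights)
    if total_weight <= 0.0:
        raise ValueError("threat weights must sum to a positive value")
    weighted = sum(
        w * cumulative_detection_probability(p, n)
        for p, n, w in zip(p_single, n_scans, threat_weights)
    )
    return weighted / total_weight


def discounted_return(rewards: Sequence[float], gamma: float) -> float:
    """Standard discounted return: r_0 + gamma * r_1 + gamma^2 * r_2 + ..."""
    if not 0.0 <= gamma <= 1.0:
        raise ValueError("gamma must lie in [0, 1]")
    total = 0.0
    # Accumulate from the last reward backwards so each step applies one gamma.
    for r in reversed(rewards):
        total = r + gamma * total
    return total
```

Under these assumptions, shifting scans toward a high-threat target raises the weighted objective more than shifting them toward a low-threat one, which is the kind of trade-off the allocation coefficients described in the abstract would have to balance.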