
Computer Engineering (计算机工程)



Research on Role-Based Multi-Agent Reinforcement Learning Methods

  • Published: 2025-04-01


Abstract: Multi-agent reinforcement learning (MARL) plays a crucial role in solving complex cooperative tasks. However, traditional methods face significant limitations in dynamic environments and under information non-stationarity. To address these challenges, this paper proposes a role-based multi-agent reinforcement learning framework (RoMAC). The framework divides roles according to action attributes and uses a role assignment network to allocate roles to agents dynamically, thereby improving the efficiency of multi-agent collaboration. In addition, the framework adopts a hierarchical communication design comprising attention-based inter-role communication and mutual-information-guided inter-agent communication. In inter-role communication, RoMAC leverages an attention mechanism to generate efficient messages for coordination between role delegates; in inter-agent communication, mutual information is used to produce targeted messages that improve decision quality within each role group. Experiments conducted in the StarCraft Multi-Agent Challenge (SMAC) environment show that RoMAC improves the average win rate by approximately 8.62%, shortens convergence time by about 0.92 million steps, and reduces communication load by 28.18% on average. Ablation studies further confirm that each module of RoMAC contributes critically to performance, demonstrating the robustness and flexibility of the model. Overall, the experimental results indicate that RoMAC offers significant advantages in multi-agent reinforcement learning and cooperative tasks, providing reliable support for solving complex tasks efficiently.
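The two-level design described in the abstract — a role assignment network that maps observations to roles, followed by attention-based message exchange between role delegates — can be sketched as below. This is a minimal illustrative sketch, not the paper's implementation: the single-layer assignment network, the single-head scaled dot-product attention, and all names (`assign_roles`, `inter_role_messages`, `W_role`, etc.) and dimensions are assumptions chosen for brevity.

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(x, axis=-1):
    z = x - x.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

# Hypothetical dimensions: 4 agents, 8-dim observations, 2 roles, 6-dim messages.
n_agents, obs_dim, n_roles, msg_dim = 4, 8, 2, 6

# --- Role assignment network (reduced to one linear layer for brevity). ---
# Maps each agent's observation to a distribution over roles; the arg-max
# is the agent's role for this step, so roles can change as observations do.
W_role = rng.normal(size=(obs_dim, n_roles))

def assign_roles(obs):                       # obs: (n_agents, obs_dim)
    probs = softmax(obs @ W_role, axis=-1)   # (n_agents, n_roles)
    return probs.argmax(axis=-1)             # one role index per agent

# --- Attention-based inter-role communication (single head). ---
# Each role delegate projects its state into query/key/value vectors; the
# attention weights decide how strongly each delegate attends to the others
# when composing its outgoing message.
W_q = rng.normal(size=(obs_dim, msg_dim))
W_k = rng.normal(size=(obs_dim, msg_dim))
W_v = rng.normal(size=(obs_dim, msg_dim))

def inter_role_messages(states):             # states: (n_roles, obs_dim)
    q, k, v = states @ W_q, states @ W_k, states @ W_v
    attn = softmax(q @ k.T / np.sqrt(msg_dim), axis=-1)  # (n_roles, n_roles)
    return attn @ v                          # one message vector per role

obs = rng.normal(size=(n_agents, obs_dim))
roles = assign_roles(obs)
# A delegate state per role: here simply the mean observation of its members.
delegate_states = np.stack([
    obs[roles == r].mean(axis=0) if (roles == r).any() else np.zeros(obs_dim)
    for r in range(n_roles)
])
messages = inter_role_messages(delegate_states)
print(roles.shape, messages.shape)  # (4,) (2, 6)
```

In a full system, the messages would be fed back to the agents in each role group, where the abstract's mutual-information-guided channel would shape the intra-group communication; that second stage is omitted here.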