Abstract: Many multi-agent Q-learning problems cannot be solved because the number of joint actions grows exponentially with the number of agents, making the approach infeasible in practice. This paper investigates a regional cooperative Q-learning method that examines joint actions only in those states where coordination among agents is actually required; in all other states, simple single-agent Q-learning is applied. This yields a compact state-action value representation, reducing the number of state-action pairs that must be examined during learning without compromising much in solution quality. Experiments on the RoboCup simulation 2D platform, an ideal testbed for multi-agent systems, show that the method is more efficient than commonly used multi-agent reinforcement learning algorithms.
Key words:
Markov Decision Processes (MDP),
Q-learning,
regional cooperation,
simulation 2D
LIU Liang, LI Long-shu. Multi-Agent Q-learning in RoboCup Based on Regional Cooperative[J]. Computer Engineering, 2009, 35(9): 11-13, 1.
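
As a concrete illustration of the scheme described in the abstract, the following minimal Python sketch (not taken from the paper; the class name, state encoding, and hyperparameters are all illustrative assumptions) maintains a joint-action Q-table only for designated coordination states and falls back to ordinary single-agent Q-learning everywhere else.

import random
from collections import defaultdict

class RegionalCoopQLearner:
    """Illustrative sketch of regionally cooperative Q-learning.

    In states listed in coord_states, values are learned over joint
    actions (own action plus partners' actions); in all other states,
    ordinary single-agent Q-learning is used, so the exponentially
    large joint-action table is confined to the coordination states.
    """

    def __init__(self, actions, coord_states, alpha=0.1, gamma=0.9, epsilon=0.1):
        self.actions = list(actions)          # individual action set
        self.coord_states = set(coord_states) # states requiring coordination
        self.alpha, self.gamma, self.epsilon = alpha, gamma, epsilon
        self.q_single = defaultdict(float)    # (state, action) -> value
        self.q_joint = defaultdict(float)     # (state, joint_action) -> value

    def _options(self, state, partner_actions):
        """Return the Q-table and candidate actions to use in this state."""
        if state in self.coord_states and partner_actions:
            # Joint actions are enumerated only here, keeping the
            # combinatorial blow-up out of all other states.
            joint = [(a,) + tuple(p) for a in self.actions
                     for p in partner_actions]
            return self.q_joint, joint
        return self.q_single, self.actions

    def choose(self, state, partner_actions=()):
        table, acts = self._options(state, partner_actions)
        if random.random() < self.epsilon:    # epsilon-greedy exploration
            return random.choice(acts)
        return max(acts, key=lambda a: table[(state, a)])

    def update(self, state, action, reward, next_state, partner_actions=()):
        table, _ = self._options(state, partner_actions)
        next_table, next_acts = self._options(next_state, partner_actions)
        best_next = max(next_table[(next_state, a)] for a in next_acts)
        td_error = reward + self.gamma * best_next - table[(state, action)]
        table[(state, action)] += self.alpha * td_error

In a simulation 2D setting, the coordination states might be chosen as those in which teammates contend for the ball, while off-ball behavior is learned independently; that partitioning is a design choice the paper's method leaves to the task, not something this sketch prescribes.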