Abstract:
To address continuous reinforcement learning problems, this paper presents a Q-learning algorithm that integrates a heuristic function and an evaluation function. The heuristic function accelerates learning, while the evaluation function reduces unnecessary exploration, which improves learning efficiency. To ensure the effectiveness of the algorithm, both functions are computed from the Q function. Simulation results on the Tank game show that the algorithm significantly improves the learning efficiency of Q-learning.
Key words: Q-learning, heuristic function, evaluation function, online game
WANG Hong-yan. Novel Heuristic Q-learning Algorithm[J]. Computer Engineering, 2009, 35(22): 173-175.
王洪彦. 新的启发式Q学习算法[J]. 计算机工程, 2009, 35(22): 173-175.
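
The abstract gives no formulas, so the following is only a minimal tabular sketch of how a heuristic term and an evaluation filter, both derived from Q-values, could be combined with ordinary Q-learning. The function names, the snapshot-based heuristic, the `margin` filter, and all parameter values (`eta`, `xi`, `margin`, `epsilon`, `alpha`, `gamma`) are assumptions made for illustration and are not taken from the paper.

```python
import random


def heuristic_from_q(q_snapshot_row, eta=1.0):
    """Hypothetical heuristic: give a bonus `eta` to the action that was
    best in an earlier snapshot of the Q-values for this state."""
    best = max(range(len(q_snapshot_row)), key=q_snapshot_row.__getitem__)
    return [eta if a == best else 0.0 for a in range(len(q_snapshot_row))]


def candidate_actions(q_row, margin=0.5):
    """Hypothetical evaluation filter: keep only actions whose current
    Q-value is within `margin` of the best one, so that exploration
    skips actions already judged clearly worse."""
    top = max(q_row)
    return [a for a, q in enumerate(q_row) if top - q <= margin]


def select_action(q_row, h_row, epsilon=0.1, xi=1.0, margin=0.5, rng=random):
    """Epsilon-greedy choice restricted to the filtered actions; the
    greedy branch maximizes Q(s, a) + xi * H(s, a)."""
    allowed = candidate_actions(q_row, margin)
    if rng.random() < epsilon:
        return rng.choice(allowed)
    return max(allowed, key=lambda a: q_row[a] + xi * h_row[a])


def q_update(q, s, a, reward, s_next, alpha=0.1, gamma=0.9):
    """Standard one-step Q-learning update on a table q[state][action]."""
    q[s][a] += alpha * (reward + gamma * max(q[s_next]) - q[s][a])
```

In this sketch the heuristic only biases which action is chosen; the Q-learning update itself is unchanged, so the learned values are still driven by observed rewards.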