
Computer Engineering ›› 2009, Vol. 35 ›› Issue (13): 190-192. doi: 10.3969/j.issn.1000-3428.2009.13.066

• Artificial Intelligence and Recognition Technology •

Algorithm of TD(λ) Based on Factored Representation

DAI Shuai, YIN Chang-ming, ZHANG Xin

  1. (School of Computer & Communication Engineering, Changsha University of Science & Technology, Changsha 410076)
  • Received: 1900-01-01 Revised: 1900-01-01 Online: 2009-07-05 Published: 2009-07-05

Abstract: This paper proposes a new TD(λ) algorithm based on factored representation. Its main idea is to represent states in factored form: Dynamic Bayesian Networks (DBNs) represent the state transition probability function of a Markov Decision Process (MDP), and decision trees represent the state value function used by TD(λ). Together these reduce the cost of searching and computing over the state space, so the algorithm is suited to MDPs with large state spaces. Experiments show that the representation is effective.

Key words: factored representation, Dynamic Bayesian Networks (DBNs), decision tree, TD(λ) algorithm
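
As a rough illustration of the idea in the abstract, the sketch below runs TD(λ) with accumulating eligibility traces while storing one value per decision-tree leaf instead of one per state. It is not the authors' implementation: the two-variable state `(x, y)`, the hand-written tree `leaf_of`, the function `td_lambda`, and the parameter defaults are all assumptions made for this example, and the DBN transition model is left out entirely (transitions are supplied as recorded episodes).

```python
from collections import defaultdict


def leaf_of(state):
    """Hypothetical hand-written decision tree over a factored state (x, y).

    States that reach the same leaf share a single value estimate.
    """
    x, y = state
    if x == 0:
        return "x=0"  # y is ignored on this branch, so (0, 0) and (0, 1) merge
    return f"x=1,y={y}"


def td_lambda(episodes, leaf_of, alpha=0.1, gamma=0.9, lam=0.8):
    """TD(lambda) with accumulating traces, keeping one value per tree leaf.

    Each episode is a list of (state, reward, next_state) transitions;
    next_state is None when the episode terminates.
    """
    values = defaultdict(float)
    for episode in episodes:
        traces = defaultdict(float)  # eligibility traces reset every episode
        for state, reward, next_state in episode:
            leaf = leaf_of(state)
            next_value = 0.0 if next_state is None else values[leaf_of(next_state)]
            delta = reward + gamma * next_value - values[leaf]  # TD error
            traces[leaf] += 1.0
            for key in list(traces):
                values[key] += alpha * delta * traces[key]
                traces[key] *= gamma * lam  # decay every trace after the update
    return values


# One recorded episode: (1, 0) -> (0, 1) -> terminal, with reward 1 on the last step.
episode = [((1, 0), 0.0, (0, 1)), ((0, 1), 1.0, None)]
values = td_lambda([episode], leaf_of, alpha=0.5, gamma=1.0, lam=1.0)
```

Because values are indexed by leaf, the table grows with the number of leaves rather than with the number of joint states, which is the kind of saving the abstract attributes to the factored representation.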
