
Computer Engineering

• Artificial Intelligence and Recognition Technology •

Research on Point-based Value Iteration Method for FO-POMDP

CHEN Li-na, HUANG Hong-bin, DENG Su

  1. (Key Laboratory of Information System Engineering, National University of Defense Technology, Changsha 410073, China)
  • Received: 2012-05-14 Publication date: 2013-10-15 Release date: 2013-10-14
  • About the authors: CHEN Li-na (born 1983), female, Ph.D. candidate, main research interest: intelligent decision-making; HUANG Hong-bin, associate professor, Ph.D.; DENG Su, professor, doctoral supervisor
  • Supported by:
    National Natural Science Foundation of China (71071160)

Abstract: Building on the Partially Observable Markov Decision Process (POMDP), this paper presents the First-Order Partially Observable Markov Decision Process (FO-POMDP), which expresses a POMDP in the situation calculus of first-order logic. To characterize the level of abstraction of states in the FO-POMDP model, the concepts of state granularity and belief-state granularity are introduced. A granularity resolution method converts belief states to a single fixed granularity, and a distance measure between belief points is defined at that granularity. With these tools, Point-Based Value Iteration (PBVI) is extended to the logical abstraction level, yielding First-Order PBVI (FO-PBVI). Experimental results show that the algorithm solves problems relatively quickly and produces solutions of relatively good quality.

Key words: Partially Observable Markov Decision Process (POMDP), state space, belief state, granularity resolution, Point-Based Value Iteration (PBVI)
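
For readers unfamiliar with PBVI, the following is a minimal sketch of the standard (flat, enumerated-state) point-based backup that the abstract says is lifted to the logical level. It is not the authors' FO-PBVI: it has no situation calculus, no state granularity, and no granularity resolution. The table layout (`T`, `O`, `R`), the function names, and the pessimistic initial value function are all assumptions made for illustration.

```python
from typing import List, Sequence

Vector = List[float]


def dot(u: Sequence[float], v: Sequence[float]) -> float:
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(u, v))


def point_based_backup(b, alphas, T, O, R, gamma) -> Vector:
    """Return the best backed-up alpha vector at belief point b.

    Assumed (hypothetical) table layout:
      T[a][s][s2] = P(s2 | s, a)   transition model
      O[a][s2][o] = P(o | s2, a)   observation model
      R[a][s]     = immediate reward for action a in state s
    """
    n_states = len(b)
    best_vec, best_val = None, float("-inf")
    for a in range(len(T)):
        vec = list(R[a])  # start from the immediate reward
        n_obs = len(O[a][0])
        for o in range(n_obs):
            # Project every current alpha vector back through (a, o).
            projected = [
                [
                    sum(T[a][s][s2] * O[a][s2][o] * alpha[s2]
                        for s2 in range(n_states))
                    for s in range(n_states)
                ]
                for alpha in alphas
            ]
            # Keep only the projection that scores highest at b.
            g = max(projected, key=lambda p: dot(p, b))
            vec = [v + gamma * gv for v, gv in zip(vec, g)]
        val = dot(vec, b)
        if val > best_val:
            best_vec, best_val = vec, val
    return best_vec


def pbvi(beliefs, T, O, R, gamma, n_iters) -> List[Vector]:
    """Run n_iters rounds of backups over a fixed set of belief points."""
    n_states = len(beliefs[0])
    # Pessimistic initial value: worst reward received forever.
    r_min = min(min(row) for row in R)
    alphas = [[r_min / (1.0 - gamma)] * n_states]
    for _ in range(n_iters):
        alphas = [point_based_backup(b, alphas, T, O, R, gamma)
                  for b in beliefs]
    return alphas


def value(b, alphas) -> float:
    """Value of belief b under a set of alpha vectors."""
    return max(dot(alpha, b) for alpha in alphas)
```

The sketch keeps one alpha vector per belief point, which is what keeps each PBVI iteration polynomial in the number of points. A first-order variant would presumably replace the enumerated states with abstract state descriptions, which is where a belief-point distance at a fixed granularity would be needed.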
