作者投稿和查稿 主编审稿 专家审稿 编委审稿 远程编辑

计算机工程

• 人工智能及识别技术 • 上一篇    下一篇

一种事件序列的加权变阶马尔可夫模型

吴宏和,陈黎飞,郭躬德   

  1. (福建师范大学数学与计算机科学学院,福州 350007)
  • 收稿日期:2013-07-11 出版日期:2014-04-15 发布日期:2014-04-14
  • 作者简介:吴宏和(1987-),男,硕士研究生,主研方向:数据挖掘;陈黎飞,副教授、博士;郭躬德,教授、博士。
  • 基金资助:
    国家自然科学基金资助项目(61175123)。

A Weighted Variable Order Markov Model for Event Sequences

WU Hong-he, CHEN Li-fei, GUO Gong-de   

  1. (School of Mathematics and Computer Science, Fujian Normal University, Fuzhou 350007, China)
  • Received:2013-07-11 Online:2014-04-15 Published:2014-04-14

摘要: 变阶马尔可夫模型是对事件序列建模的一种简单且有效的模型,但经典变阶马尔可夫模型只考虑转移概率,未关注子序列本身出现的频率。为此,提出一种加权的变阶马尔可夫模型,在经典变阶马尔可夫模型基础上根据子序列的频率构建一棵加权概率后缀树。给出一种剪枝策略,在构建后缀树时根据结点相似程度剪除树枝,以提高模型的泛化能力,并在线性时间内完成加权概率后缀树的构建。通过将加权的模型应用于事件序列分类进行实验验证,结果表明,该模型可以对不同领域的实际序列数据进行有效分类。

关键词: 变阶马尔可夫模型, 概率后缀树, 事件序列, 分类, 加权, 剪枝

Abstract: Variable-order Markov Model(VLMM) is a simple but effective model for event sequences modeling. However, the classic VLMM only considers the transition probability, without taking into account the frequency of the substring. A Weighted VLMM(WVLMM) is proposed in this paper, constructing a Weighted Probabilistic Suffix Tree(WPST) via the frequency of the substring based on the classic VLMM. It also proposes a strategy for branches pruning based on the degree of the similarity of the nodes while constructing the tree, in order to improve the generalization ability of the model, and to construct the tree in a linear time complexity. To validate the effectiveness of the model, the proposed model is applied to the classification of event sequences. Experimental results demonstrate that the new model can make an effective classification on real-world sequence datasets in different applications.

Key words: Variable-order Markov Model(VLMM), probabilistic suffix tree, event sequence, classification, weighted, pruning

中图分类号: