
Computer Engineering ›› 2011, Vol. 37 ›› Issue (14): 62-64. doi: 10.3969/j.issn.1000-3428.2011.14.019

• Software Technology and Database •

Research on Mining Algorithm of Maximal Sub-Frequent Pattern

ZHANG Hai-qing 1, LIU Yin-tian 1,2

  (1. Intelligent Information Processing Lab, Chengdu University of Information Technology, Chengdu 610225, China; 2. College of Mathematics, Sichuan University, Chengdu 610065, China)
  • Received: 2010-12-07 Online: 2011-07-20 Published: 2011-07-20
  • About the authors: ZHANG Hai-qing (born 1986), female, master's student, main research interest: data mining; LIU Yin-tian (corresponding author), associate professor, Ph.D.
  • Supported by:

    National Natural Science Foundation of China (60773169, 60702075)

Abstract:

Traditional maximal frequent pattern mining is constrained by a trade-off between itemset frequency and itemset length: under a given support threshold, it cannot find patterns that contain more items than the traditional maximal frequent patterns. To address this, the paper proposes the concept of the Maximal Sub-Frequent Pattern (MSFP) and a corresponding mining algorithm, MSFP-mining. The main contributions are: (1) the definition of MSFP and an analysis of its characteristics; (2) a candidate-MSFP mining method based on AFP-tree, CMP-tree, SFP-tree, and SFP-growth; (3) a superset check and pruning strategy for maximal sub-frequent patterns based on MSFP-tree; and (4) an experimental evaluation of the mining performance of MSFP-mining. The experimental results show that the algorithm uses differentiated frequency levels to obtain and combine core itemsets, additional frequent itemsets, and supplementary frequent itemsets in stages, so that it mines maximal sub-frequent patterns while still guaranteeing itemset frequency and expands the scale of the frequent patterns found.

Key words: pattern mining, Maximal Sub-Frequent Pattern (MSFP), data set, superset check, MSFP-tree structure
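
The superset check named in the abstract can be illustrated in its simplest, list-based form. The sketch below is only an illustration of the general idea (a candidate is discarded when a retained pattern strictly contains it) and is not MSFP-mining itself: the abstract does not describe the MSFP-tree layout or the sub-frequency criterion, so the function name `keep_maximal` and the plain-set representation of itemsets are assumptions introduced here.

```python
from typing import FrozenSet, Iterable, List

Itemset = FrozenSet[str]


def keep_maximal(candidates: Iterable[Iterable[str]]) -> List[Itemset]:
    """Return the candidates that have no proper superset among the candidates.

    Hypothetical helper: a generic superset check over plain sets, not the
    tree-based check of the paper.
    """
    # Deduplicate, then visit longer itemsets first so that any maximal
    # superset of a candidate is already kept when the candidate is checked.
    unique = {frozenset(c) for c in candidates}
    ordered = sorted(unique, key=lambda s: (-len(s), sorted(s)))

    maximal: List[Itemset] = []
    for itemset in ordered:
        # Superset check: drop the candidate if a kept pattern strictly contains it.
        if not any(itemset < kept for kept in maximal):
            maximal.append(itemset)
    return maximal


if __name__ == "__main__":
    patterns = keep_maximal([{"a", "b"}, {"a", "b", "c"}, {"b", "c"}, {"d"}])
    print([sorted(p) for p in patterns])
```

This pairwise version compares each candidate against every kept pattern, which is quadratic in the number of candidates; tree-based structures such as the MSFP-tree mentioned in the abstract are typically introduced to avoid that cost.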
