
Computer Engineering ›› 2020, Vol. 46 ›› Issue (5): 150-156. doi: 10.19678/j.issn.1000-3428.0054747

• Advanced Computing and Data Processing •

Weighted Frequent Itemsets Mining Algorithm Based on Difference Nodeset

WANG Bin, FANG Xinxiu, WEI Tianyou

  1. School of Information and Control Engineering, Qingdao University of Technology, Qingdao, Shandong 266520, China
  • Received: 2019-04-28 Revised: 2019-06-25 Published: 2019-07-03
  • About the authors: WANG Bin (born 1963), male, professor, Ph.D.; main research interests: knowledge discovery, game theory and its applications. FANG Xinxiu and WEI Tianyou, master's students.
  • Funding:
    National Natural Science Foundation of China (61502262).

Abstract: NFWI, a weighted frequent itemset mining algorithm based on the WN-list structure, suffers from low mining efficiency. To address this, this paper proposes DiffNFWI, a weighted frequent itemset mining algorithm based on WDiffNodeset. WDiffNodeset is obtained by extending the DiffNodeset data structure. The algorithm combines a set-enumeration tree with a hybrid search strategy to find weighted frequent itemsets, which avoids a large number of intersection operations and makes the search efficient. It computes the weighted support of each itemset with a difference-set strategy, which reduces the amount of computation. Experimental results on mushroom, pumsb, and other datasets show that DiffNFWI runs more efficiently than NFWI.

Key words: weighted frequent itemsets, weighted support, set-enumeration tree, hybrid search strategy, difference set strategy
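
The difference-set idea named in the abstract can be illustrated with a small sketch. It is not the paper's WDiffNodeset structure, which is built on DiffNodeset and is not specified here. Instead it assumes the simpler transaction-id diffsets of dEclat-style mining, and it assumes a common definition of weighted support: the weight of a transaction is the mean weight of its items, and the weighted support of an itemset is the total weight of the transactions containing it divided by the total weight of all transactions. The function name `mine_weighted_frequent`, the sample database, and the item weights are all hypothetical.

```python
def mine_weighted_frequent(db, item_weight, min_ws):
    """Depth-first search over a set-enumeration tree using diffsets.

    db          : list of non-empty sets of items (assumed input format)
    item_weight : dict mapping item -> weight (assumed input format)
    min_ws      : minimum weighted support, a fraction in (0, 1]
    Returns a dict mapping itemset (sorted tuple) -> weighted support.
    """
    # Assumed definition: transaction weight = mean weight of its items.
    tw = [sum(item_weight[i] for i in t) / len(t) for t in db]
    total = sum(tw)

    def weight_of(tids):
        return sum(tw[t] for t in tids)

    # Transaction-id sets of single items.
    tidsets = {}
    for tid, t in enumerate(db):
        for item in t:
            tidsets.setdefault(item, set()).add(tid)

    # Level 1: diffset relative to the empty prefix, d(x) = T \ t(x).
    all_tids = set(range(len(db)))
    roots = []
    for item in sorted(tidsets):
        diff = all_tids - tidsets[item]
        wsum = total - weight_of(diff)
        if wsum / total >= min_ws:
            roots.append(((item,), diff, wsum))

    result = {}

    def expand(siblings):
        # siblings share a prefix P; each entry is (Px, d(Px), weight of t(Px)).
        for idx, (x, dx, wx) in enumerate(siblings):
            result[x] = wx / total
            children = []
            for y, dy, _ in siblings[idx + 1:]:
                # d(Pxy) = d(Py) \ d(Px): transactions with Px but without y.
                dxy = dy - dx
                # Weight of t(Pxy) follows by subtraction, no intersection needed.
                wxy = wx - weight_of(dxy)
                if wxy / total >= min_ws:
                    children.append((x + (y[-1],), dxy, wxy))
            expand(children)

    expand(roots)
    return result


# Hypothetical toy data, for illustration only.
sample_db = [
    {"A", "B", "D", "E"},
    {"B", "C", "E"},
    {"A", "B", "D", "E"},
    {"A", "B", "C", "E"},
    {"A", "B", "C", "D", "E"},
    {"B", "C", "D"},
]
sample_weights = {"A": 0.6, "B": 0.1, "C": 0.3, "D": 0.9, "E": 0.2}

for itemset, ws in sorted(mine_weighted_frequent(sample_db, sample_weights, 0.4).items()):
    print(itemset, round(ws, 3))
```

Under these assumptions the weighted support of a child itemset is obtained by subtracting the weight of its diffset from the parent's weight, so no transaction-id intersections are computed below the first level. Pruning is safe because adding an item can only shrink the set of supporting transactions.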

CLC Number: