Mining Method of Data Stream Maximum Frequent Itemsets

Computer Engineering, 2010, Vol. 36, Issue 22: 86-87. doi: 10.3969/j.issn.1000-3428.2010.22.030

ZHANG Yue-qin, CHEN Dong

(College of Electronic and Information Engineering, Nanjing University of Technology, Nanjing 210009, China)

Online: 2010-11-20  Published: 2010-11-18
About the authors: ZHANG Yue-qin (b. 1975), female, lecturer, M.S.; main research interests: artificial intelligence and data mining. CHEN Dong, lecturer, M.S.
Funding:
Young Teachers' Academic Fund of Nanjing University of Technology (39709013)

Abstract

A method called AFMI, based on a transaction matrix, is proposed for mining maximum frequent itemsets. AFMI finds the maximum frequent itemsets over all transactions by iteratively condensing the transaction matrix, and it searches for them only among the subsets of the intersections of the condensed transaction vectors, which narrows the search range. Logical operations and pruning are used to improve mining efficiency. Building on AFMI, an algorithm called AFMI+ is studied for mining maximum frequent itemsets over a sliding window on a data stream; it lets users periodically mine the maximum frequent itemsets in the current window. Experimental results show that both AFMI and AFMI+ perform well.

Key words: data mining, data stream, sliding window, maximum frequent itemsets, matrix
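The abstract gives no pseudocode, so the following is only a minimal sketch of the general idea it describes, not the authors' AFMI or AFMI+ algorithms. It assumes the usual definition of a maximum (maximal) frequent itemset: a frequent itemset with no frequent proper superset. The transaction matrix is stored column-wise as bitmasks, support is counted with a bitwise AND, candidates are drawn only from subsets of the condensed rows, and subsets of already-found results are pruned. All names (`build_item_vectors`, `support`, `maximal_frequent_itemsets`, `SlidingWindowMiner`) are hypothetical, and the window class simply re-mines on demand rather than updating the matrix incrementally.

```python
from collections import deque
from itertools import combinations


def build_item_vectors(rows):
    """Column-wise transaction matrix: item -> bitmask of the rows containing it."""
    vectors = {}
    for index, row in enumerate(rows):
        for item in row:
            vectors[item] = vectors.get(item, 0) | (1 << index)
    return vectors


def support(itemset, vectors, n_rows):
    """Support count = number of set bits in the AND of the item columns."""
    mask = (1 << n_rows) - 1
    for item in itemset:
        mask &= vectors.get(item, 0)
    return bin(mask).count("1")


def maximal_frequent_itemsets(transactions, min_count):
    """Brute-force top-down search; intended for small illustrative inputs only."""
    rows = [frozenset(t) for t in transactions]
    vectors = build_item_vectors(rows)
    frequent = {i for i, v in vectors.items() if bin(v).count("1") >= min_count}
    # Condense the matrix: drop infrequent columns, then empty and duplicate rows.
    condensed = {r & frequent for r in rows} - {frozenset()}
    found = []
    max_len = max((len(r) for r in condensed), default=0)
    # Visit candidate sizes from largest to smallest, so any frequent candidate
    # that is not inside an earlier result has no frequent proper superset.
    for k in range(max_len, 0, -1):
        candidates = {
            frozenset(c)
            for r in condensed if len(r) >= k
            for c in combinations(r, k)
        }
        for cand in candidates:
            if any(cand <= m for m in found):
                continue  # pruning: subset of a known maximal itemset
            if support(cand, vectors, len(rows)) >= min_count:
                found.append(cand)
    return found


class SlidingWindowMiner:
    """Keeps the most recent `window_size` transactions; mines only when asked."""

    def __init__(self, window_size, min_count):
        self.window = deque(maxlen=window_size)  # oldest transaction drops out
        self.min_count = min_count

    def add(self, transaction):
        self.window.append(frozenset(transaction))

    def mine(self):
        return maximal_frequent_itemsets(list(self.window), self.min_count)


# Toy data (made up for illustration, not taken from the paper).
demo = [{"a", "b", "c"}, {"a", "b", "c"}, {"b", "d"}, {"b", "d"}, {"a", "d"}]
static_result = maximal_frequent_itemsets(demo, min_count=2)
```

On this toy data with a minimum support count of 2, the static call returns the itemsets {a, b, c} and {b, d}. Feeding the same five transactions into `SlidingWindowMiner(window_size=3, min_count=2)` leaves only the last three in the window, for which `mine()` returns {b, d}.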

CLC number: TP311

ZHANG Yue-qin, CHEN Dong. Mining Method of Data Stream Maximum Frequent Itemsets[J]. Computer Engineering, 2010, 36(22): 86-87.

http://www.ecice06.com/CN/Y2010/V36/I22/86

