FP-Growth Parallel Algorithm in Cluster System

Computer Engineering ›› 2009, Vol. 35 ›› Issue (20): 71-72. doi: 10.3969/j.issn.1000-3428.2009.20.024

FP-Growth Parallel Algorithm in Cluster System

CHEN Min1, LI Hui-fei2

(1. School of Economics and Management, University of Science and Technology Beijing, Beijing 100083; 2. IBM Global Services (China) Company Limited, Beijing 100027)

Online: 2009-10-20 Published: 2009-10-20

Abstract: On large databases, the FP-Growth algorithm is expensive in both memory usage and computation time. This paper proposes a parallel algorithm designed to run on a PC cluster. The algorithm uses a projection method to find the conditional database (conditional pattern base) of each frequent item directly. It splits the mining of these conditional databases into a number of independent sub-tasks and assigns them to the nodes of the cluster, which execute them in parallel; a central node then aggregates the partial results and outputs the final result. Experiments show that the parallel algorithm speeds up computation, avoids the memory overflow that occurs when the database is too large, and achieves much better scalability than the FP-Growth algorithm.

Key words: FP-Growth algorithm, PC cluster, parallel algorithm
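The decomposition described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation, which this page does not give. All names (`frequent_items`, `project`, `mine`, `parallel_fp_growth`) are hypothetical. A thread pool stands in for the cluster nodes, and each conditional database is mined by recursive projection rather than by building an FP-tree, purely to keep the example short.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def frequent_items(transactions, min_support):
    """Count each item once per transaction and keep the frequent ones."""
    counts = Counter(item for t in transactions for item in set(t))
    return {item: c for item, c in counts.items() if c >= min_support}


def rank_items(counts):
    """Rank items by descending support (ties broken by item value)."""
    ordered = sorted(counts, key=lambda i: (-counts[i], i))
    return {item: r for r, item in enumerate(ordered)}


def project(transactions, order, item):
    """Conditional database of `item`: for every transaction containing
    it, keep only the frequent items ranked before it."""
    rank = order[item]
    projected = []
    for t in transactions:
        if item in t:
            prefix = [i for i in set(t) if i in order and order[i] < rank]
            if prefix:
                projected.append(prefix)
    return projected


def mine(cond_db, min_support, suffix):
    """Grow patterns that extend `suffix` from its conditional database."""
    counts = frequent_items(cond_db, min_support)
    order = rank_items(counts)
    patterns = {}
    for item, support in counts.items():
        pattern = suffix | {item}
        patterns[pattern] = support
        patterns.update(mine(project(cond_db, order, item), min_support, pattern))
    return patterns


def parallel_fp_growth(transactions, min_support, workers=4):
    """One independent sub-task per frequent item; the caller plays the
    role of the central node and merges the partial results."""
    counts = frequent_items(transactions, min_support)
    order = rank_items(counts)

    def sub_task(item):
        # Assumption: on a real cluster this would run on a remote node.
        cond_db = project(transactions, order, item)
        return mine(cond_db, min_support, frozenset([item]))

    result = {frozenset([item]): c for item, c in counts.items()}
    with ThreadPoolExecutor(max_workers=workers) as pool:
        for partial in pool.map(sub_task, counts):
            result.update(partial)
    return result
```

Calling `parallel_fp_growth(db, min_support)` on a list of transactions returns a dictionary that maps each frequent itemset (a `frozenset`) to its support count. Because every sub-task reads only its own conditional database, the sub-tasks share no mutable state, which is what makes them suitable for distribution across nodes.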

CLC Number: TP311.13

CHEN Min; LI Hui-fei. FP-Growth Parallel Algorithm in Cluster System[J]. Computer Engineering, 2009, 35(20): 71-72.

http://www.ecice06.com/CN/Y2009/V35/I20/71
