
Computer Engineering ›› 2018, Vol. 44 ›› Issue (11): 14-18, 26. doi: 10.19678/j.issn.1000-3428.0048605

• Advanced Computing and Data Processing •

Accelerating Apriori Algorithm of GPU Cluster Based on Hadoop Platform

QU Shiqi, LIU Shaojiang, NI Weichuan, YU Qingmao

  1. Xinhua College, Sun Yat-sen University, Guangzhou 510520, China
  • Received: 2017-09-08 Online: 2018-11-15 Published: 2018-11-15
  • About the authors: QU Shiqi (b. 1992), male, teaching assistant, M.S., main research interest: data mining; LIU Shaojiang, assistant experimentalist, M.S.; NI Weichuan, undergraduate student; YU Qingmao, engineer.

Abstract:

To address the limited computing power of cluster nodes when the Apriori algorithm runs on the Hadoop platform, this paper combines the GPU, which offers strong parallel processing capability, with Hadoop and proposes a GPU-Hadoop computing structure. Through the MapReduce framework of the Hadoop platform, each node hands the compute-intensive tasks of the Apriori algorithm to the GPU for processing, which shortens the computation time. Experimental results show that the improved Apriori algorithm achieves high execution speed and computational efficiency on large-scale datasets.

Key words: Apriori algorithm, Hadoop platform, cluster node, intensive task, large-scale dataset
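
As a rough illustration of the division of labor the abstract describes, the following Python sketch runs Apriori in a map/reduce style over data splits. It is a minimal sketch under stated assumptions, not the authors' implementation: all function names (`count_support`, `map_phase`, `reduce_phase`, `generate_candidates`, `apriori`) are hypothetical, the splits are processed in a single process rather than on Hadoop nodes, and the support-counting kernel that a GPU-Hadoop design would presumably offload to the GPU is written here as plain CPU code.

```python
from collections import Counter
from itertools import combinations


def count_support(transactions, candidates):
    """Count how many transactions contain each candidate itemset.

    This nested loop is the compute-intensive step. In a GPU-Hadoop
    setup it is the part one would offload to the GPU; here it is
    plain Python (an assumption made for illustration).
    """
    counts = Counter()
    for t in transactions:
        for c in candidates:
            if c <= t:  # candidate is a subset of the transaction
                counts[c] += 1
    return counts


def map_phase(split, candidates):
    """Mapper: emit (itemset, local count) pairs for one data split."""
    return list(count_support(split, candidates).items())


def reduce_phase(mapped, min_support):
    """Reducer: sum local counts and keep itemsets meeting min_support."""
    totals = Counter()
    for pairs in mapped:
        for itemset, n in pairs:
            totals[itemset] += n
    return {s: n for s, n in totals.items() if n >= min_support}


def generate_candidates(frequent, k):
    """Build k-itemsets whose (k-1)-subsets are all frequent."""
    items = sorted({i for s in frequent for i in s})
    return [
        frozenset(c)
        for c in combinations(items, k)
        if all(frozenset(sub) in frequent for sub in combinations(c, k - 1))
    ]


def apriori(splits, min_support):
    """Level-wise Apriori: one map/reduce round per itemset size."""
    items = sorted({i for split in splits for t in split for i in t})
    candidates = [frozenset([i]) for i in items]
    result, k = {}, 1
    while candidates:
        mapped = [map_phase(split, candidates) for split in splits]
        frequent = reduce_phase(mapped, min_support)
        result.update(frequent)
        k += 1
        candidates = generate_candidates(frequent, k)
    return result


if __name__ == "__main__":
    # Two hypothetical data splits, standing in for two cluster nodes.
    splits = [
        [{"a", "b", "c"}, {"a", "b"}],
        [{"a", "c"}, {"b", "c"}, {"a", "b", "c"}],
    ]
    for itemset, n in apriori(splits, min_support=3).items():
        print(sorted(itemset), n)
```

Each pass of the loop corresponds to one MapReduce round: mappers count candidate support on their own split, and the reducer merges the counts and filters by the support threshold. Only `count_support` would need to change to move the heavy work onto a GPU.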
