
Computer Engineering


Fast Closed Tree Clustering Parallel Algorithm for Dynamic Cloud Platform

GUO Xin, YAN Yi-ming, XU Hong-zhi, QIN Zun-yue   

  1. (School of Software Service Outsourcing, Jishou University, Zhangjiajie 427000, China)
  • Received: 2012-09-05 Online: 2013-09-15 Published: 2013-09-13

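  • About the authors: GUO Xin (郭 鑫, born 1984), male, teaching assistant, M.S.; main research interests: data mining and parallel computing. YAN Yi-ming (颜一鸣) and XU Hong-zhi (徐洪智), lecturers. QIN Zun-yue (覃遵跃), associate professor.
  • Supported by:
    General Project funded by the Education Department of Hunan Province (10C1100); Jishou University Scientific Research Program (11JD051)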

Abstract: To improve the efficiency of clustering, this paper proposes a fast closed tree clustering parallel algorithm based on a dynamic cloud platform. To address the random task allocation strategy of the cloud computing platform Hadoop, it presents a task allocation algorithm, CDA-GA, that minimizes consumption cost, and builds a dynamic cloud platform model on top of that algorithm. The traditional frequent closed tree mining algorithm and the clustering algorithm are parallelized and applied to the dynamic cloud platform, yielding a closed tree clustering framework for that platform. Experimental results show that the algorithm is feasible and effective and is suitable for clustering analysis of large-scale data.

Key words: data mining, cloud computing, parallel computing, closed tree, tree clustering, massive data

