Computer Engineering (计算机工程)

• Advanced Computing and Data Processing •

Fast Closed Tree Clustering Parallel Algorithm for Dynamic Cloud Platform (动态云平台下的快速闭树聚类并行算法)

GUO Xin (郭鑫), YAN Yi-ming (颜一鸣), XU Hong-zhi (徐洪智), QIN Zun-yue (覃遵跃)

  1. (School of Software Service Outsourcing, Jishou University, Zhangjiajie 427000, Hunan, China)
  • Received: 2012-09-05 Online: 2013-09-15 Published: 2013-09-13
  • About the authors: GUO Xin (b. 1984), male, teaching assistant, M.S.; main research interests: data mining and parallel computing. YAN Yi-ming and XU Hong-zhi, lecturers. QIN Zun-yue, associate professor.
  • Supported by:
    General Project of the Education Department of Hunan Province (10C1100); Jishou University Research Program (11JD051)

Abstract: To improve the efficiency of clustering algorithms, this paper proposes a fast closed tree clustering parallel algorithm based on a dynamic cloud platform. To address the random task allocation strategy of the cloud computing platform Hadoop, it presents CDA-GA, a task allocation algorithm that minimizes consumption cost, and builds a dynamic cloud platform model on top of it. The traditional frequent closed tree mining algorithm and the clustering algorithm are then parallelized and applied to the dynamic cloud platform, yielding a closed tree clustering algorithm framework for that platform. Experimental results show that the algorithm is effective and feasible, and that it is suited to clustering analysis on large-scale data.

Key words: data mining, cloud computing, parallel computing, closed tree, tree clustering, massive data
