Computer Engineering, 2010, Vol. 36, Issue (5): 76-78. doi: 10.3969/j.issn.1000-3428.2010.05.028

• Software Technology and Database •

Distributed Data Mining Approach with High Performance Cloud

GUI Bing-xiang, HE Jian

  1. (Department of Computer and Information Engineering, Wuhan Polytechnic University, Wuhan 430023)
  • Received: 1900-01-01 Revised: 1900-01-01 Online: 2010-03-05 Published: 2010-03-05

Abstract: This paper presents a distributed data-parallel processing approach built on a high-performance cloud and briefly describes its structural design. The cloud is designed so that data can be processed where it resides, without being moved. Using a dedicated layered network-service structure, the approach is suited to mining large distributed data sets produced by computer clusters connected over high-performance wide area networks. Experimental results show that the approach performs significantly better than the Hadoop method.

Key words: storage cloud, compute cloud, distributed data-parallel processing approach, data mining
