摘要:
为满足大规模空间数据库的聚类需求,面向计算机集群,提出一种基于密度的并行聚类算法。该算法根据数据库分布特征进行数据分区,在每一个节点上对数据块并行聚类,在主节点上合并聚类结果。实验结果表明,该算法的计算速度随着节点数的增多呈线性增加,具有较好的延展性。
关键词:
并行聚类,
计算机集群,
数据库,
延展性
Abstract:
To meet the demand for clustering large-scale spatial databases, this paper proposes a density-based parallel clustering algorithm for computer clusters. The algorithm partitions the data according to the distribution characteristics of the database, clusters the data blocks in parallel on each node, and merges the clustering results on the master node. Experimental results show that the computing speed of the algorithm increases linearly with the number of nodes and that the algorithm scales well.
Key words:
parallel clustering,
computer cluster,
database,
scalability
陈敏, 高学东, 栾绍峻, 郗玉平. 基于密度的并行聚类算法[J]. 计算机工程, 2010, 36(11): 8-10.
CHEN Min, GAO Xue-Dong, LUAN Shao-Jun, XI Yu-Ping. Parallel Clustering Algorithm Based on Density[J]. Computer Engineering, 2010, 36(11): 8-10.
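
The partition, local clustering, and merge steps named in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the abstract does not specify the partitioning rule, the local clustering method, or the merge criterion, so the sketch assumes equal-count strips along the x axis with a halo of width eps, a plain DBSCAN on each block, and a merge that joins two local clusters when they share a core point. Threads stand in for cluster nodes, and the names `local_dbscan` and `parallel_dbscan` are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor
from math import dist


def local_dbscan(points, ids, eps, min_pts):
    """Plain O(n^2) DBSCAN on one block; returns ({point id: local label}, core point ids)."""
    n = len(points)
    neighbors = [[j for j in range(n) if dist(points[i], points[j]) <= eps] for i in range(n)]
    core = [len(nb) >= min_pts for nb in neighbors]  # a point counts as its own neighbor
    labels = [None] * n
    next_label = 0
    for i in range(n):
        if labels[i] is not None or not core[i]:
            continue
        labels[i] = next_label
        stack = [i]
        while stack:
            p = stack.pop()
            if not core[p]:
                continue  # border points join a cluster but do not expand it
            for q in neighbors[p]:
                if labels[q] is None:
                    labels[q] = next_label
                    stack.append(q)
        next_label += 1
    return ({ids[i]: labels[i] for i in range(n) if labels[i] is not None},
            {ids[i] for i in range(n) if core[i]})


def parallel_dbscan(points, eps, min_pts, n_blocks):
    """Partition into x-strips, cluster each strip concurrently, merge on the caller's side."""
    if not points:
        return []
    # Assumed partitioning rule: equal-count strips along x, each extended by an eps halo.
    order = sorted(range(len(points)), key=lambda i: points[i][0])
    size = -(-len(order) // n_blocks)  # ceiling division
    blocks = []
    for start in range(0, len(order), size):
        owned = order[start:start + size]
        lo, hi = points[owned[0]][0] - eps, points[owned[-1]][0] + eps
        blocks.append([i for i in order if lo <= points[i][0] <= hi])

    # Threads stand in for worker nodes in this sketch.
    with ThreadPoolExecutor() as pool:
        results = list(pool.map(
            lambda ids: local_dbscan([points[i] for i in ids], ids, eps, min_pts), blocks))

    # Merge step: union-find over (block index, local label) keys.
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    core = set().union(*(c for _, c in results))
    seen = {}  # point id -> first (block, label) key that contains the point
    for b, (labels, _) in enumerate(results):
        for pid, lab in labels.items():
            key = (b, lab)
            find(key)
            if pid in seen and pid in core:
                parent[find(key)] = find(seen[pid])  # shared core point joins the two clusters
            seen.setdefault(pid, key)

    roots = {}
    return [roots.setdefault(find(seen[i]), len(roots)) if i in seen else -1
            for i in range(len(points))]
```

Calling `parallel_dbscan(points, eps, min_pts, n_blocks)` on a list of 2-D tuples returns one cluster label per point, with -1 marking noise. The halo is what lets each block decide core status for its own points without seeing the rest of the data, which keeps the merge step limited to comparing labels of shared points.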