Abstract:
This paper analyzes the classical k-means clustering algorithm and proves how complete global clustering information can be generated from local clustering information while reducing the communication cost between computing nodes. The resulting clustering quality is equivalent to that of the corresponding serial algorithm, and execution is more efficient. On this basis, the paper presents a reliable parallel k-means algorithm based on the Message Passing Interface (MPI). Experimental results show that the algorithm is efficient and feasible.
Key words:
clustering,
k-means algorithm,
parallel,
Message Passing Interface (MPI)
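The abstract's central claim is that global clustering information can be rebuilt from local information without loss. One common way to achieve this (not necessarily the paper's exact scheme) is for each node to send only per-cluster coordinate sums and point counts, which are then added across nodes. The sketch below is a minimal illustration under that assumption. It simulates the nodes as in-process data partitions instead of calling MPI, and all function names (`local_stats`, `combine`, `update`, and so on) are hypothetical. In a real MPI program, `combine` would correspond to a sum-reduction across processes.

```python
# Illustrative sketch only: "nodes" are simulated as list partitions.
# All names here are hypothetical and not taken from the paper.

def nearest(point, centroids):
    """Index of the centroid closest to `point` (squared Euclidean distance)."""
    return min(
        range(len(centroids)),
        key=lambda j: sum((p - c) ** 2 for p, c in zip(point, centroids[j])),
    )

def local_stats(points, centroids):
    """Per-cluster coordinate sums and point counts for one node's data."""
    k, dim = len(centroids), len(centroids[0])
    sums = [[0.0] * dim for _ in range(k)]
    counts = [0] * k
    for p in points:
        j = nearest(p, centroids)
        counts[j] += 1
        for d in range(dim):
            sums[j][d] += p[d]
    return sums, counts

def combine(stats):
    """Element-wise sum of all nodes' statistics (stands in for a sum-reduction)."""
    k, dim = len(stats[0][0]), len(stats[0][0][0])
    sums = [[sum(s[j][d] for s, _ in stats) for d in range(dim)] for j in range(k)]
    counts = [sum(c[j] for _, c in stats) for j in range(k)]
    return sums, counts

def update(sums, counts, old):
    """New centroids; an empty cluster keeps its previous centroid."""
    return [
        [x / counts[j] for x in sums[j]] if counts[j] else list(old[j])
        for j in range(len(old))
    ]

def kmeans_step_serial(points, centroids):
    """One serial k-means iteration over the whole data set."""
    return update(*local_stats(points, centroids), centroids)

def kmeans_step_partitioned(partitions, centroids):
    """One iteration in which each partition contributes only sums and counts."""
    return update(*combine([local_stats(p, centroids) for p in partitions]), centroids)

if __name__ == "__main__":
    data = [(0, 0), (1, 0), (0, 1), (8, 8), (9, 8), (8, 9), (1, 1), (9, 9)]
    start = [(0, 0), (10, 10)]
    parts = [data[:3], data[3:6], data[6:]]  # three simulated nodes
    print(kmeans_step_serial(data, start))
    print(kmeans_step_partitioned(parts, start))
```

Under this scheme each node contributes k × (dim + 1) numbers per iteration regardless of how many points it holds, and the combined update matches the serial centroid update up to floating-point summation order.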
TAO Ye, ZENG Zhi-Yong, YU Jian-Kun, FENG Tao. Completeness Proof and Implementation of Parallel k-means Clustering Algorithm[J]. Computer Engineering, 2010, 36(22): 72-74.