Parallel Implementation of K-means Algorithm with Batch Processing

doi:10.3969/j.issn.1000-3428.2012.13.043

Computer Engineering ›› 2012, Vol. 38 ›› Issue (13): 145-147,151.

• Networks and Communications •


LAN Yuan-dong, LIU Yu-fang, XU Tao

(Department of Computer Science, Huizhou University, Huizhou 516007, China)

Received: 2011-10-18 Online: 2012-07-05 Published: 2012-07-05


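About the authors: LAN Yuan-dong (b. 1975), male, Ph.D. candidate, main research interests: pattern recognition and machine learning; LIU Yu-fang, associate professor; XU Tao, Ph.D. candidate
Supported by: Key Project of the National "863" Program in the advanced manufacturing field (2006AA04A120); Guangdong Universities Outstanding Young Innovative Talents Training Program (LYM09128)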

Abstract

Abstract: The K-means algorithm is computationally intensive, slow to converge, and time-consuming to run. To address these problems, a new parallel implementation of K-means is presented. On a General Purpose computation on Graphics Processing Unit (GPGPU) architecture, Compute Unified Device Architecture (CUDA) is used to accelerate the algorithm. By processing the data in batches, the implementation makes more rational use of the memory types CUDA provides, avoids access conflicts, and reduces the number of accesses to the data set, which improves the efficiency of K-means. Experimental results on large-scale data sets show that the algorithm clusters faster.

Key words: data mining, K-means algorithm, Compute Unified Device Architecture (CUDA), parallel algorithm, clustering analysis, Graphics Processing Unit (GPU)
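The batch principle named in the abstract can be illustrated with a minimal sketch. This is not the authors' CUDA kernel, which the abstract does not show; it is a CPU-side Python illustration under the assumption that "batch processing" means each batch of points is read once per iteration and its contribution is folded into per-cluster running sums before the next batch is loaded. The names `nearest`, `kmeans_batched`, and `batch_size` are hypothetical.

```python
from typing import List, Sequence, Tuple

Point = Sequence[float]


def nearest(point: Point, centroids: List[List[float]]) -> int:
    """Return the index of the centroid closest to `point` (squared Euclidean distance)."""
    best, best_dist = 0, float("inf")
    for j, centroid in enumerate(centroids):
        dist = sum((p - q) ** 2 for p, q in zip(point, centroid))
        if dist < best_dist:
            best, best_dist = j, dist
    return best


def kmeans_batched(
    data: Sequence[Point],
    init_centroids: Sequence[Point],
    batch_size: int,
    max_iter: int = 100,
) -> Tuple[List[List[float]], List[int]]:
    """K-means in which every iteration streams the data set batch by batch.

    Each point is read exactly once per iteration: it is assigned to its nearest
    centroid and immediately added to that cluster's running sum and count.
    """
    centroids = [list(c) for c in init_centroids]
    dim = len(centroids[0])
    labels = [-1] * len(data)

    for _ in range(max_iter):
        sums = [[0.0] * dim for _ in centroids]   # per-cluster coordinate sums
        counts = [0] * len(centroids)             # per-cluster point counts
        changed = False

        for start in range(0, len(data), batch_size):
            batch = data[start:start + batch_size]  # assumed analogue of one on-chip batch
            for offset, point in enumerate(batch):
                j = nearest(point, centroids)
                if labels[start + offset] != j:
                    labels[start + offset] = j
                    changed = True
                counts[j] += 1
                for d in range(dim):
                    sums[j][d] += point[d]

        # Update step: a cluster that received no points keeps its old centroid.
        for j in range(len(centroids)):
            if counts[j]:
                centroids[j] = [s / counts[j] for s in sums[j]]

        if not changed:  # no assignment changed, so the clustering has converged
            break

    return centroids, labels
```

In this sketch the batch size affects only how the data is traversed, not the clustering result. On a GPU, the batch size would presumably be chosen so that one batch fits in fast on-chip memory, but that mapping is an assumption rather than something stated in the abstract.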


CLC Number: TP301.6

LAN Yuan-dong, LIU Yu-fang, XU Tao. Parallel Implementation of K-means Algorithm with Batch Processing[J]. Computer Engineering, 2012, 38(13): 145-147,151.



URL: https://www.ecice06.com/EN/Y2012/V38/I13/145

