Implementation of K-nearest Neighbor Algorithm Based on GPU

doi:10.3969/j.issn.1000-3428.2015.02.036

Abstract

K-nearest neighbor (KNN) search is a classical problem whose computational cost grows rapidly with the size of the data set, which makes it a natural candidate for acceleration on the graphics processing unit (GPU) and its massively parallel computing power. To address the heavy time overhead of KNN, this paper analyzes existing GPU-based KNN implementations together with the architectural features of the GPU, and then parallelizes KNN on the GPU. The implementation improves data access efficiency by exploiting coalesced access to global memory, and it reduces thread serialization in the sort phase by filtering out in advance as much of the data to be sorted as possible. Experiments on the KDD, Poker, and Covertype data sets show that the implementation reaches up to 266.37 × 10^9 floating-point operations per second in the distance computation phase and up to 26.47 × 10^9 in the sort phase, outperforming the existing methods it is compared against.

Key words: K-nearest Neighbor (KNN) problem, Graphics Processing Unit (GPU), parallel computing, algorithm acceleration, coalesced access, global memory
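
The filter-before-sort idea described in the abstract can be illustrated with a small sequential sketch. It is not the paper's GPU implementation: the abstract does not say how the filter is chosen, so the threshold rule below (the k-th smallest distance within a leading sample of the data) is an assumption made for illustration, and the names `squared_distances`, `knn_filtered`, and `sample_size` are hypothetical. On a GPU the distance loop would be spread across threads; here everything runs on the CPU in plain Python.

```python
def squared_distances(query, points):
    """Squared Euclidean distance from one query to every reference point."""
    return [sum((q - p) ** 2 for q, p in zip(query, pt)) for pt in points]


def knn_filtered(query, points, k, sample_size=None):
    """Return indices of the k nearest points, sorting only a filtered subset.

    Assumption (not from the paper): the filter threshold is the k-th smallest
    distance within a leading sample of the data, so at least k candidates
    always survive the filter.
    """
    if not 1 <= k <= len(points):
        raise ValueError("k must be between 1 and the number of points")
    dists = squared_distances(query, points)

    # Derive an upper bound on the true k-th nearest distance from a sample.
    n_sample = max(k, sample_size or 4 * k)
    sample = sorted(dists[:n_sample])
    threshold = sample[k - 1]

    # Filter: only candidates at or below the bound can be among the k nearest.
    candidates = [(d, i) for i, d in enumerate(dists) if d <= threshold]

    # Sort the (usually much smaller) candidate set and keep the first k.
    candidates.sort()
    return [i for _, i in candidates[:k]]
```

For example, `knn_filtered((0, 0), [(0, 0), (1, 1), (5, 5), (0.5, 0), (10, 10), (2, 2)], 3, sample_size=3)` sorts five surviving candidates instead of six and returns the same indices as a full sort. The saving grows when the sample bound is tight relative to the data set.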

CLC Number: TP311

TIAN Pan, HUA Bei, LU Li. Implementation of K-nearest Neighbor Algorithm Based on GPU [J]. Computer Engineering.

URL:

https://www.ecice06.com/EN/Y2015/V41/I2/189
