Computing Performance Prediction Model for Sparse Matrix Vector Multiplication Based on Deep Learning

doi: 10.19678/j.issn.1000-3428.0060481

Computer Engineering, 2022, Vol. 48, Issue (2): 86-91.

CAO Zhongxiao^1,2, FENG Yangde^1, WANG Jue^1, MIN Weixiao^3, YAO Tiechui^1,2, GAO Yue^4, WANG Lihua^3, GAO Fuhai^4

1. Computer Network Information Center, Chinese Academy of Sciences, Beijing 100190, China;
2. University of Chinese Academy of Sciences, Beijing 100049, China;
3. School of Software, Beihang University, Beijing 100191, China;
4. China Institute of Atomic Energy, Beijing 102413, China

Received: 2021-01-04 Revised: 2021-03-02 Published: 2022-02-14
About the authors: CAO Zhongxiao (born 1994), female, master's student, main research interest: parallel computing; FENG Yangde and WANG Jue (corresponding author), associate researchers, Ph.D.; MIN Weixiao, master's student; YAO Tiechui, Ph.D. student; GAO Yue, engineer, M.S.; WANG Lihua, professor; GAO Fuhai, associate researcher, M.S.
Funding:
National Key Research and Development Program of China (2017YFB0202302).

Abstract

Abstract: Sparse Matrix Vector Multiplication (SpMV) is the computational core of solving sparse linear systems. It is widely used in scientific computing and engineering applications such as economic modeling and signal processing, so research on SpMV and its tuning techniques helps improve computational efficiency in these fields. Traditional SpMV automatic tuning methods improve SpMV performance by setting architecture parameters of the hardware platform, but the large number of parameter settings enlarges the search space and greatly increases the time spent on automatic tuning. To achieve fast automatic tuning, this paper uses deep learning to build an SpMV computing performance prediction model based on a Convolutional Neural Network (CNN). The model consists of two fusion stages: dual-channel sparse matrix feature fusion, and fusion of sparse matrix features with architecture features. To improve the prediction accuracy of SpMV computing time, feature data are selected and constructed, and box plots are used to gather statistics on SpMV time. Experiments are designed and validated on the Florida sparse matrix dataset. The results show that the model predicts SpMV computing time with an accuracy above 80% and has strong generalization ability.

Key words: Sparse Matrix Vector Multiplication (SpMV), automatic tuning, deep learning, Convolutional Neural Network (CNN), feature fusion
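
As a rough illustration of two ingredients named in the abstract, the sketch below shows the SpMV kernel whose running time is being predicted and a box-plot style summary of repeated timings. It is not the authors' implementation: the CSR storage format, the function names (`spmv_csr`, `time_spmv`, `box_plot_stats`), and the 1.5 x IQR fence rule are assumptions chosen for illustration only.

```python
import statistics
import time


def spmv_csr(indptr, indices, data, x):
    """Compute y = A @ x for a matrix A stored in CSR form (assumed format)."""
    y = [0.0] * (len(indptr) - 1)
    for row in range(len(y)):
        acc = 0.0
        # Non-zeros of this row occupy data[indptr[row]:indptr[row + 1]].
        for k in range(indptr[row], indptr[row + 1]):
            acc += data[k] * x[indices[k]]
        y[row] = acc
    return y


def time_spmv(indptr, indices, data, x, repeats=20):
    """Run the kernel several times and return wall-clock samples in seconds."""
    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        spmv_csr(indptr, indices, data, x)
        samples.append(time.perf_counter() - start)
    return samples


def box_plot_stats(samples):
    """Return quartiles and 1.5 * IQR fences (a common box-plot convention)."""
    q1, median, q3 = statistics.quantiles(samples, n=4, method="inclusive")
    iqr = q3 - q1
    return {
        "q1": q1,
        "median": median,
        "q3": q3,
        "lower_fence": q1 - 1.5 * iqr,
        "upper_fence": q3 + 1.5 * iqr,
    }


if __name__ == "__main__":
    # 3x3 example matrix [[4, 0, 1], [0, 2, 0], [3, 0, 5]] in CSR form.
    indptr = [0, 2, 3, 5]
    indices = [0, 2, 1, 0, 2]
    data = [4.0, 1.0, 2.0, 3.0, 5.0]
    x = [1.0, 2.0, 3.0]
    print(spmv_csr(indptr, indices, data, x))  # [7.0, 4.0, 18.0]
    print(box_plot_stats(time_spmv(indptr, indices, data, x)))
```

Summarizing repeated timings by quartiles rather than a single measurement makes the time label less sensitive to outlier runs; how the paper actually turns these statistics into prediction targets is not stated in the abstract.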

CLC Number: TP332

CAO Zhongxiao, FENG Yangde, WANG Jue, MIN Weixiao, YAO Tiechui, GAO Yue, WANG Lihua, GAO Fuhai. Computing Performance Prediction Model for Sparse Matrix Vector Multiplication Based on Deep Learning[J]. Computer Engineering, 2022, 48(2): 86-91.

https://www.ecice06.com/CN/Y2022/V48/I2/86

Figures/Tables: 9

