作者投稿和查稿 主编审稿 专家审稿 编委审稿 远程编辑

计算机工程 ›› 2026, Vol. 52 ›› Issue (3): 332-345. doi: 10.19678/j.issn.1000-3428.0069807

• 高性能计算与大数据 • 上一篇    下一篇

面向GPU的稀疏对角矩阵自适应SpMV优化方法

王宇华1,2, 何俊飞1, 张宇琪1, 兰海燕1,*(), 曹林琳1   

  1. 1. 哈尔滨工程大学计算机学院, 黑龙江 哈尔滨 150001
    2. 电子政务建模仿真国家工程实验室, 黑龙江 哈尔滨 150001
  • 收稿日期:2024-04-29 修回日期:2024-08-19 出版日期:2026-03-15 发布日期:2024-10-28
  • 通讯作者: 兰海燕
  • 作者简介:

    王宇华(CCF高级会员),男,副教授、博士,主研方向为并行计算、人工智能

    何俊飞,硕士研究生

    张宇琪,硕士研究生

    兰海燕(通信作者),讲师、博士

    曹林琳,硕士研究生

  • 基金资助:
    国家自然科学基金(62072135)

Sparse Diagonal Matrix Adaptive SpMV Optimization Method for GPU

WANG Yuhua1,2, HE Junfei1, ZHANG Yuqi1, LAN Haiyan1,*(), CAO Linlin1   

  1. 1. School of Computer Science, Harbin Engineering University, Harbin 150001, Heilongjiang, China
    2. National Engineering Laboratory of Modeling and Emulation in E-Government, Harbin 150001, Heilongjiang, China
  • Received:2024-04-29 Revised:2024-08-19 Online:2026-03-15 Published:2024-10-28
  • Contact: LAN Haiyan

摘要:

稀疏矩阵向量乘(SpMV)是稀疏线性系统的计算核心和瓶颈, 其运算效率会影响迭代求解器的整体性能, 其优化研究一直是科学计算和工程应用领域中的研究热点之一。偏微分方程的离散化会产生稀疏对角矩阵, 由于其多样的非零元分布, 导致没有一种方法能够在所有矩阵中取得最优时间性能。针对上述问题, 提出一种面向图形处理单元(GPU)的稀疏对角矩阵自适应SpMV优化方法AST(Adaptive SpMV Tuning)。该方法通过设计特征空间, 构建特征提取器, 提取矩阵结构精细特征, 通过深入分析特征和SpMV方法的相关性, 建立可扩展的候选方法集合, 形成特征和最优方法的映射关系, 构建性能预测工具, 实现矩阵最优方法的高效预测。实验结果表明, AST能够取得85.8%的预测准确率, 平均时间性能损失为0.09, 相比于DIA(Diagonal)、HDIA(Hacked DIA)、HDC(Hybrid of DIA and Compressed Sparse Row)、DIA-Adaptive和DRM(Divide-Rearrange and Merge), 能够获得平均20.19、1.86、3.06、3.72和1.53倍的内核运行时间加速和1.05、1.28、12.45、1.94和0.97倍的浮点运算性能加速。

关键词: 稀疏矩阵向量乘, 稀疏对角矩阵, 图形处理单元, 自适应优化方法, 矩阵结构特征

Abstract:

Sparse Matrix-Vector multiplication (SpMV) is the computational core and bottleneck of sparse linear systems, and its computational efficiency affects the overall performance of iterative solvers. Its optimization has long been a research hotspot in the fields of scientific computing and engineering applications. The discretization of partial differential equations produces sparse diagonal matrices, and because of their diverse distributions of nonzero elements, no single method can achieve optimal time performance across all matrices. To solve these problems, a Graphics Processing Unit (GPU)-based sparse diagonal matrix adaptive SpMV optimization method called Adaptive SpMV Tuning (AST) is proposed. This method designs a feature space and constructs a feature extractor to extract fine-grained features of the matrix structure. By analyzing the correlation between these features and SpMV methods, it establishes a scalable set of candidate methods and forms a mapping relationship between the features and optimal methods. Subsequently, a performance prediction tool is built to efficiently predict the optimal method for the matrix. The experimental results show that AST can achieve a prediction accuracy of 85.8%, with an average time performance loss of 0.09. Compared to Diagonal (DIA), Hacked DIA (HDIA), Hybrid of DIA and Compressed Sparse Row (HDC), DIA-Adaptive, and Divide-Rearrange and Merge (DRM), AST can achieve an average speedup in kernel runtime of 20.19, 1.86, 3.06, 3.72, and 1.53 times, respectively, and a speedup in floating-point performance of 1.05, 1.28, 12.45, 1.94, and 0.97 times, respectively.

Key words: Sparse Matrix-Vector multiplication (SpMV), sparse diagonal matrix, Graphics Processing Unit (GPU), adaptive optimization method, matrix structural feature