Computer Engineering ›› 2012, Vol. 38 ›› Issue (18): 262-264. doi: 10.3969/j.issn.1000-3428.2012.18.071

• Development Research and Design Technology •

A Cholesky Decomposition Overlapped Algorithm

ZHANG De-hao, LIU Qing-kun

  1. (School of Computer and Information Technique, Liaoning Normal University, Dalian 116086, China)
  • Received: 2011-11-14 Revised: 2011-12-23 Online: 2012-09-20 Published: 2012-09-18
  • About the authors: ZHANG De-hao (b. 1981), male, master's degree; main research interests: distributed systems and parallel computing. LIU Qing-kun, associate professor.

Abstract: In Graphics Processing Unit (GPU) computing, GPU device memory and host memory differ greatly in capacity, so the data to be processed usually cannot be copied from host memory to video memory in a single transfer. To address this, the paper proposes an overlapped algorithm for Cholesky decomposition. It uses prefetching to overlap data copying with computation, which reduces the time the device spends waiting. Device memory is divided into two buffers that alternately hold the data for the current computation and the data for the next one, and the data exchange between device memory and host memory is completed while the device is computing. Experimental results show that the algorithm effectively improves computational efficiency.

Key words: Graphics Processing Unit (GPU), prefetching, overlapped algorithm, general purpose computation, Cholesky decomposition, cluster system
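
The two-buffer scheme described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: it shows only the scheduling idea (copy block i+1 into one buffer while block i is processed in the other), and it assumes that the work can be split into independent blocks. The names `load_block`, `compute_block` and `overlapped_process` are hypothetical stand-ins; `load_block` plays the role of a host-to-device copy and `compute_block` the role of a device kernel, here replaced by a sum of squares rather than a Cholesky step.

```python
from concurrent.futures import ThreadPoolExecutor


def load_block(host_blocks, index, buffer):
    """Hypothetical stand-in for a host-to-device copy into `buffer`."""
    buffer[:] = host_blocks[index]
    return buffer


def compute_block(buffer):
    """Hypothetical stand-in for the device kernel (a sum of squares)."""
    return sum(x * x for x in buffer)


def overlapped_process(host_blocks):
    """Process blocks with two buffers, prefetching block i+1 during block i."""
    if not host_blocks:
        return []
    buffers = [[], []]  # the two halves of the simulated device memory
    results = []
    with ThreadPoolExecutor(max_workers=1) as copier:
        # Prime the pipeline: the first block must be resident before any compute.
        pending = copier.submit(load_block, host_blocks, 0, buffers[0])
        for i in range(len(host_blocks)):
            current = pending.result()  # wait until block i has been copied
            if i + 1 < len(host_blocks):
                # Start copying block i+1 into the buffer not in use ...
                pending = copier.submit(
                    load_block, host_blocks, i + 1, buffers[(i + 1) % 2]
                )
            # ... while block i is processed in the other buffer.
            results.append(compute_block(current))
    return results


blocks = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
print(overlapped_process(blocks))  # prints [5.0, 25.0, 61.0]
```

The buffer written by the copy thread is always the one whose previous contents were consumed in the preceding iteration, so a copy never overwrites data that is still being computed on. On a real GPU the same pattern would be expressed with asynchronous copies and kernels rather than Python threads.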
