Computer Engineering (计算机工程)

• Advanced Computing and Data Processing •

Research on GPU Parallel Algorithm of Heat Conduction Based on CUDA

MENG Xiao-hua a,b, HUANG Cong-shan a, ZHU Li-sha a,b

  1. (a. Department of Computer Science; b. Sino-France Joint Laboratory for Astrometry, Dynamics and Space Science, Jinan University, Guangzhou 510632, China)
  • Received: 2013-02-26 Online: 2014-05-15 Published: 2014-05-14
  • About the authors: MENG Xiao-hua (born 1965), male, associate professor, M.S.; main research direction: parallel and distributed systems. HUANG Cong-shan, master's student. ZHU Li-sha, M.S.
  • Supported by:
    National Natural Science Foundation of China (61073064).

Abstract: When a one-dimensional heat conduction algorithm processes a large number of particles, the traditional CPU serial algorithm and the MPI parallel algorithm both suffer from low execution efficiency and long processing times. Because the Graphics Processing Unit (GPU) is well suited to data-parallel computation on large data volumes, this paper proposes and implements a GPU parallel heat conduction algorithm in the Compute Unified Device Architecture (CUDA) parallel programming environment, with the CPU and GPU working cooperatively. The block and grid sizes are set according to the GPU hardware configuration. The particles are divided into several blocks and transferred to the GPU for parallel computation, with each thread performing the calculation for one particle. The results are then copied back to CPU main memory, where the CPU computes the average heat flow of each particle. Experimental results show that, compared with the CPU serial algorithm in terms of time efficiency, the GPU parallel algorithm reaches a speedup of nearly 900 when the number of particles reaches 16 000, and the speedup continues to grow as the number of particles increases.

Key words: heat conduction algorithm, Graphics Processing Unit (GPU), Compute Unified Device Architecture (CUDA), parallel computing, time efficiency, speedup ratio
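The one-thread-per-particle mapping described in the abstract can be illustrated with a small sketch. This is not the authors' code and not CUDA itself: it is a CPU-side Python emulation of the index arithmetic a CUDA launch would use. The block size of 256 threads is an assumption made only for illustration (the paper chooses block and grid sizes from the GPU hardware configuration), and the function names `launch_config`, `particle_index`, and `simulate_launch` are hypothetical.

```python
def launch_config(n_particles, threads_per_block=256):
    """Return (blocks_per_grid, threads_per_block) covering all particles.

    threads_per_block=256 is an assumed value, not taken from the paper.
    """
    # Ceiling division: enough blocks so every particle gets a thread.
    blocks_per_grid = (n_particles + threads_per_block - 1) // threads_per_block
    return blocks_per_grid, threads_per_block


def particle_index(block_idx, thread_idx, block_dim):
    """Global index, mirroring blockIdx.x * blockDim.x + threadIdx.x."""
    return block_idx * block_dim + thread_idx


def simulate_launch(n_particles, threads_per_block=256):
    """Emulate the grid on the CPU and count how often each particle is visited."""
    blocks, threads = launch_config(n_particles, threads_per_block)
    hits = [0] * n_particles
    for b in range(blocks):
        for t in range(threads):
            i = particle_index(b, t, threads)
            if i < n_particles:  # bounds guard for the partially filled last block
                hits[i] += 1
    return blocks, hits


if __name__ == "__main__":
    blocks, hits = simulate_launch(16000)
    print(blocks, all(h == 1 for h in hits))
```

Under these assumptions, 16 000 particles need 63 blocks of 256 threads, and the bounds guard leaves the 128 surplus threads in the last block idle, so each particle is handled exactly once.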

CLC Number: