摘要: 在热传导算法中,使用传统的CPU串行算法或MPI并行算法处理大批量粒子时,存在执行效率低、处理时间长的问题。而图形处理单元(GPU)具有大数据量并行运算的优势,为此,在统一计算设备架构(CUDA)并行编程环境下,采用CPU和GPU协同合作的模式,提出并实现一个基于CUDA的热传导GPU并行算法。根据GPU硬件配置设定Block和Grid的大小,将粒子划分为若干个block,粒子输入到GPU显卡中并行计算,每一个线程执行一个粒子计算,并将结果传回CPU主存,由CPU计算出每个粒子的平均热流。实验结果表明,与CPU串行算法在时间效率方面进行对比,该算法在粒子数到达16 000时,加速比提高近900倍,并且加速比随着粒子数的增加而加速提高。
关键词:
热传导算法,
图形处理单元,
统一计算设备架构,
并行计算,
时间效率,
加速比
Abstract: For real applications processing large volume of particles in one-dimensional heat conduction problem, the response time of CPU serial algorithm and MPI parallel algorithm is too long. Considering Graphic Processing Unit(GPU) offers powerful parallel processing capabilities, it implements a GPU parallel heat conduction algorithm on Compute Unified Device Architecture(CUDA) parallel programming environment using CPU and GPU collaborative mode. The algorithm sets the block and grid size based on GPU hardware configuration. Particles are divided into a plurality of blocks, the particle is into the GPU graphics for parallel computing, and one thread performs a calculation of a particle. It retrieves the processed data to CPU main memory and calculates the average heat flow of each particle. Experimental results show that, compared with CPU serial algorithm, GPU parallel algorithm has a great advantage in time efficiency, the speedup is close to 900, and speedup can improve as the particle number size increases.
Key words:
heat conduction algorithm,
Graphic Processing Unit(GPU),
Compute Unified Device Architecture(CUDA),
parallel computing,
time efficiency,
speedup ratio
中图分类号: