Abstract: When digital image processing algorithms are ported to and optimized for heterogeneous computing platforms, their memory access performance becomes the main factor limiting system performance. To address this, the paper combines the hardware architecture features of the NVIDIA Tegra K1 with the characteristics of the specific algorithms, and studies the implementation and memory access optimization of the matrix transpose and Laplacian filtering algorithms on the NVIDIA Tegra K1 heterogeneous computing platform. The optimizations cover coalesced and vectorized memory access and the elimination of bank and channel conflicts in global memory access. Experimental results show that, after optimization, both algorithms achieve a considerable improvement in memory access performance on this platform and deliver good real-time performance.
Key words:
GPU optimization,
memory access bandwidth,
data localization,
vectorization,
coalesced access,
Laplacian filtering algorithm
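For orientation, the two algorithms named in the abstract can be sketched as a plain CPU reference. This is only an illustrative sketch and not the authors' GPU implementation. The function names `transpose_tiled` and `laplacian_4`, the tile size, the 4-neighbour kernel, and the zeroed border are all assumptions made here. On a GPU, a tile would typically be staged in on-chip memory so that both the reads and the writes touch contiguous addresses. The Python version shows only the index mapping that such a kernel has to preserve.

```python
def transpose_tiled(src, rows, cols, tile=4):
    """Transpose a row-major rows x cols matrix (flat list), one tile at a time.

    The tile size is a hypothetical parameter chosen for illustration.
    """
    dst = [0] * (rows * cols)
    for r0 in range(0, rows, tile):          # top-left corner of each tile
        for c0 in range(0, cols, tile):
            for r in range(r0, min(r0 + tile, rows)):
                for c in range(c0, min(c0 + tile, cols)):
                    # Element (r, c) of src becomes element (c, r) of dst.
                    dst[c * rows + r] = src[r * cols + c]
    return dst


def laplacian_4(src, rows, cols):
    """Apply the 4-neighbour Laplacian kernel [[0,1,0],[1,-4,1],[0,1,0]].

    Assumption: border pixels are simply left at 0.
    """
    dst = [0] * (rows * cols)
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            i = r * cols + c
            dst[i] = (src[i - cols] + src[i + cols]
                      + src[i - 1] + src[i + 1] - 4 * src[i])
    return dst
```

A reference of this kind is mainly useful for checking the output of an optimized kernel element by element, since memory access optimizations should change the access order but not the result.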
CLC number:
LIANG Jun, LI Wei, XIAO Lin, XU Xinkai. Research on Memory Access Optimization of NVIDIA Tegra K1 Heterogeneous Computing Platform[J]. Computer Engineering (计算机工程).