Abstract: As image datasets grow and resolutions rise in practical applications, the performance of edge detection algorithms has become the key constraint on real-time image processing. This paper studies GPU performance optimization of the Canny edge detection algorithm on the NVIDIA Tegra K1 heterogeneous computing platform. Drawing on the characteristics of the algorithm and of the underlying hardware architecture, it applies three optimizations: vectorized memory access, data localization, and conditional branch optimization. Experimental results show that, compared with the CPU-based Canny edge detection implementation in OpenCV 3.0, the optimized algorithm achieves speedups of 13.2x to 17.8x across different image sizes and delivers good detection performance.
Key words:
image edge detection,
heterogeneous computing platform,
vectorized memory access,
data localization,
conditional branch optimization
Chinese Library Classification (CLC) number:
WEI Qiuming, LIANG Jun, BAO Hong, WANG Jing, LI Lun. Research on Image Edge Detection Algorithm Optimization on Heterogeneous Computing Platform[J]. Computer Engineering.