
Computer Engineering, 2022, Vol. 48, Issue (12): 16-23. doi: 10.19678/j.issn.1000-3428.0063914

• Advanced Computing Technology •

Automatic Two-level Parallel Translation Method for PMVS Algorithm

LIU Jinshuo1, HUANG Shuo1, DENG Juan2

  1. Key Laboratory of Aerospace Information Security and Trusted Computing, Ministry of Education, School of Cyber Science and Engineering, Wuhan University, Wuhan 430072, China;
  2. School of Computer Science, Wuhan University, Wuhan 430072, China
  • Received: 2022-02-12 Revised: 2022-03-15 Published: 2022-07-05
  • About the authors: LIU Jinshuo (born 1973), female, professor, Ph.D., whose main research interest is high-performance computing; HUANG Shuo, master's student; DENG Juan, associate professor, Ph.D.
  • Funding:
    National Natural Science Foundation of China (61672393, U1936107).

Abstract: Image processing algorithms run slowly when high-resolution images are used as input. Parallelizing an algorithm improves its execution efficiency, but manually converting a serial program into a parallel one is tedious. Moreover, existing automatic parallel translation tools have unstable performance, and the programs they produce use only a single parallel mode. For the Patch-based Multiple View Stereo (PMVS) algorithm, this study proposes an automatic two-level parallel translation method from C to CUDA. The method uses Another Tool for Language Recognition (ANTLR) to parse the source C code automatically, identifies parallelizable loop structures by analyzing data dependencies and privatizing loop arrays, and translates the algorithm into code with a two-level parallel structure that combines CPU multithreading and the GPU. At run time, the input image is divided into two parts: one is processed by the multithreaded CPU code and the other by the GPU code, which reduces the total execution time. Experimental results show that the speedup ratio of the proposed method increases gradually with input image resolution, reaching a maximum of approximately 32, and that the method is significantly faster than the Polyhedral Parallel Code Generation (PPCG) and OpenACC automatic parallel translation methods.

Key words: two-level parallel translation, image processing algorithm, Patch-based Multiple View Stereo (PMVS), Extended Backus-Naur Form (EBNF), Abstract Syntax Tree (AST)
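
The loop array privatization named in the abstract can be illustrated with a minimal C sketch. This is not the paper's translator or any PMVS code; the functions `smooth_serial` and `smooth_privatized_reversed` and the window size `WIN` are hypothetical names chosen for illustration. In the serial form, every iteration reuses one scratch array, which looks like a cross-iteration dependence. Giving each iteration its own copy removes that dependence, so the iterations can run in any order, which is the property a loop needs before it can be mapped to CPU threads or GPU threads.

```c
#include <assert.h>
#include <stddef.h>

#define WIN 3 /* hypothetical window size, for illustration only */

/* Serial form: a single scratch buffer `tmp` is shared by all iterations.
 * Each iteration fully overwrites tmp before reading it, so the only
 * link between iterations is the reuse of the same storage. */
void smooth_serial(const int *in, int *out, size_t n)
{
    int tmp[WIN];
    assert(in != NULL && out != NULL);
    for (size_t i = 0; i + WIN <= n; ++i) {
        int sum = 0;
        for (size_t k = 0; k < WIN; ++k)
            tmp[k] = in[i + k];
        for (size_t k = 0; k < WIN; ++k)
            sum += tmp[k];
        out[i] = sum / WIN;
    }
}

/* Privatized form: tmp is declared inside the loop body, so every
 * iteration owns its own copy and no storage is shared. The loop runs
 * in reverse order here only to show that the order no longer matters. */
void smooth_privatized_reversed(const int *in, int *out, size_t n)
{
    assert(in != NULL && out != NULL);
    if (n < WIN)
        return;
    for (size_t i = n - WIN + 1; i-- > 0; ) {
        int tmp[WIN]; /* private to this iteration */
        int sum = 0;
        for (size_t k = 0; k < WIN; ++k)
            tmp[k] = in[i + k];
        for (size_t k = 0; k < WIN; ++k)
            sum += tmp[k];
        out[i] = sum / WIN;
    }
}
```

Both functions write `n - WIN + 1` outputs for an input of length `n`, and they produce identical results. That equivalence is what makes it safe to hand each iteration to a separate worker.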
