Computer Engineering (计算机工程), 2019, Vol. 45, Issue (2): 296-302. doi: 10.19678/j.issn.1000-3428.0049690

• Development Research and Engineering Application •

Research on Parallel Computation Method for Protein Phylogenetic Analysis

LI Yichan (李易禅), LING Cheng (凌诚)

  1. College of Information Science and Technology, Beijing University of Chemical Technology, Beijing 100029, China
  • Received: 2017-12-13 Online: 2019-02-15 Published: 2019-02-15
  • About the authors: LI Yichan (born 1993), female, master's student, whose main research interests are bioinformatics and parallel computing; LING Cheng, lecturer, Ph.D.
  • Funding: National Natural Science Foundation of China (61602026).

Abstract:

Most current phylogenetic analysis tools cannot analyze protein sequences on GPU architectures. To address this, a large-scale phylogenetic analysis method, tgpMC3, is proposed. The Conditional Likelihood Probabilities (CLPs) matrix is restructured by adding dummy characters, which reduces the time lost to branch divergence among threads. A semi-inter-task parallel strategy of moderate granularity is designed to increase the number of active thread blocks on each Streaming Multiprocessor (SM). Transition probability matrices that contain ambiguous states are transferred using a simple key-value mapping, which speeds up data access. Experimental results show that the method achieves a speedup of up to 117 over the serial version of MrBayes v3.1.2, and that its parallel analysis performance is better than that of the taMC3 method.

Key words: phylogenetic analysis, Conditional Likelihood Probabilities (CLPs), CUDA programming, parallel computation, MC3 algorithm

CLC number: