作者投稿和查稿 主编审稿 专家审稿 编委审稿 远程编辑

计算机工程

• 开发研究与工程应用 • 上一篇    下一篇

序列比对算法中的BW变换索引技术研究及其改进

赵雅男1,2,徐云1,2,程昊宇1,2   

  1. (1.中国科学技术大学计算机科学与技术学院,合肥 230027; 2.安徽省高性能计算重点实验室,合肥 230027)
  • 收稿日期:2014-12-31 出版日期:2016-01-15 发布日期:2015-01-15
  • 作者简介:赵雅男(1992-),女,硕士研究生,主研方向为生物信息学;徐云,教授、博士生导师;程昊宇,硕士研究生。
  • 基金资助:
    国家自然科学基金资助重点项目(61033009);国家“111”计划基金资助项目(B07033)。

Research on BW Transform Index Technology in Sequence Alignment Algorithm and Its Improvement

ZHAO Yanan 1,2,XU Yun 1,2,CHENG Haoyu 1,2   

  1. (1.School of Computer Science and Technology,University of Science and Technology of China,Hefei 230027,China;2.Key Laboratory of High Performance Computing of Anhui Province,Hefei 230027,China)
  • Received:2014-12-31 Online:2016-01-15 Published:2015-01-15

摘要: 面向大规模长序列的序列比对问题是生物信息学中最重要的基础问题之一。针对序列比对算法的主流索引技术BW变换(BWT)进行研究,提出一种新的二阶BWT索引方法。与传统BWT方法的逐位索引查找不同,改进后的BWT方法按双位索引查找。实验结果表明,改进后的方法减 少了序列比对算法中的循环遍历和计算次数,降低了序列比对算法中索引方法的复杂度,提高了查找效率,尤其适合长序列和大规模序列的索引和查找。

关键词: 序列比对, 索引, BW变换索引, 第二代测序, 第三代测序, 大规模长序列比对

Abstract: Sequence alignment of large-scale and long sequences is one of the most important and basic issues in bioinformatics.This paper focuses on Burrows-Wheeler Transform(BWT) which is the major index technology in sequence alignment algorithms and proposes a new second-order BWT index concept as well as its implementation.Different from the traditional BWT algorithm while searching with a single character,the algorithm can find two characters at one time.Experimental results show that the second-order BWT index algorithm can reduce the frequency of loop and calculation in sequence alignment algorithm.It can also reduce the alignment algorithm complexity by half and improve the search efficiency,especially for large-scale and long sequence’s index and searching process.

Key words: sequence alignment, index, Burrows-Wheeler Transform(BWT) index, next-generation sequencing, third-generation sequencing, alignment of large-scale and long sequences

中图分类号: