
Computer Engineering (计算机工程), 2009, Vol. 35, Issue 8: 208-210. doi: 10.3969/j.issn.1000-3428.2009.08.070

• Artificial Intelligence and Recognition Technology •

Mining Algorithm for Protein Sequence Pattern
(Original Chinese title: 一种新的蛋白质序列模式挖掘算法, "A New Pattern Mining Algorithm for Protein Sequences")

GUO Shun (郭顺), JIANG Qing-shan (姜青山), WANG Bei-zhan (王备战), SHI Liang (史亮)

  1. (Software School, Xiamen University, Xiamen 361005)
  • Received:1900-01-01 Revised:1900-01-01 Online:2009-04-20 Published:2009-04-20

Abstract: When applied to protein sequences, traditional pattern mining algorithms suffer from low efficiency because they generate a huge number of candidate patterns or construct projected databases many times. They can also produce unnecessary short patterns or even wrong patterns during mining. To address these problems, a novel mining algorithm called Motif-divide based Biology sequence Pattern Mining (MBioPM) is presented, based on a "motif-divide" (pattern division) method. MBioPM improves efficiency and avoids the problems described above. Theoretical analysis and experimental results show that MBioPM performs better than other related algorithms.

Key words: protein sequence, pattern mining, data mining, bioinformatics
