
Computer Engineering, 2009, Vol. 35, Issue (5): 180-182. doi: 10.3969/j.issn.1000-3428.2009.05.062

• Artificial Intelligence and Recognition Technology •

Splice Site Prediction Based on Characteristics of Sequence Motif and Support Vector Machine

SUN He-quan, PENG Qin-ke, ZHANG Quan-wei

  1. (State Key Laboratory for Manufacturing Systems Engineering, School of Electronic and Information Engineering, Xi’an Jiaotong University, Xi’an 710049)
  • Online: 2009-03-05 Published: 2009-03-05

Abstract: Based on a statistical analysis of the donor site sequences in the HS3D dataset, the patterns in which bases occur in the neighborhood of donor sites are used to construct motifs, which serve as attributes of the DNA sequences. By assigning a value to each attribute, the character sequences are mapped to numeric vectors, and a Support Vector Machine (SVM) is applied to these vectors to classify and predict donor sites. The experimental results indicate that, compared with the improved motif scoring model, the proposed method effectively reduces the influence of abnormal data on classification, and that mapping DNA character sequences into the numeric space of motif attributes is effective and practical.

Key words: sequence motif, splice site, Support Vector Machine (SVM)
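
The mapping from character sequences to numeric vectors described in the abstract can be illustrated with a minimal sketch. The motif list, the offsets relative to the donor GT, the binary attribute values, and the example windows below are all hypothetical placeholders chosen for illustration; they are not the motifs or attribute values derived in the paper.

```python
from typing import List, Tuple

# Hypothetical motifs: (offset relative to the G of the donor GT, pattern).
# Illustrative placeholders only, not the motifs reported in the paper.
MOTIFS: List[Tuple[int, str]] = [
    (-2, "AG"),   # last two exon bases before the GT
    (0, "GT"),    # canonical donor dinucleotide
    (2, "AAG"),   # three intron bases right after the GT
    (2, "GAG"),   # alternative pattern at the same position
]


def motif_vector(seq: str, gt_index: int,
                 motifs: List[Tuple[int, str]] = MOTIFS) -> List[float]:
    """Map a DNA window to a numeric vector with one entry per motif.

    Assumption: an attribute is 1.0 if the motif occurs at its offset
    and 0.0 otherwise; the paper may assign attribute values differently.
    """
    seq = seq.upper()
    vec = []
    for offset, pattern in motifs:
        start = gt_index + offset
        # A start before the window, or a pattern running past its end,
        # counts as "motif absent".
        hit = start >= 0 and seq[start:start + len(pattern)] == pattern
        vec.append(1.0 if hit else 0.0)
    return vec


# Two made-up 9-base windows with the candidate GT at index 3.
windows = ["CAGGTAAGT", "TTCGTCCAT"]
vectors = [motif_vector(w, gt_index=3) for w in windows]
```

Vectors of this kind, paired with true/false donor site labels, could then be passed to any SVM implementation (for example `sklearn.svm.SVC`) for training and classification.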
