
Computer Engineering ›› 2010, Vol. 36 ›› Issue (18): 203-204. doi: 10.3969/j.issn.1000-3428.2010.18.070

• Artificial Intelligence and Recognition Technology •

Recognition Method of Protein Interaction Sites Based on Maximum Entropy Model

DU Xiu-quan, CHENG Jia-xing, SONG Jie

  1. (Key Laboratory of Intelligent Computing & Signal Processing, Ministry of Education, Anhui University, Hefei 230039, China)
  • Online: 2010-09-20 Published: 2010-09-30
  • About the authors: DU Xiu-quan (b. 1982), male, Ph.D. candidate, research interests: intelligent computing, biological computing, machine learning; CHENG Jia-xing, professor, Ph.D.; SONG Jie, associate professor, Ph.D.
  • Supported by:

    Doctoral Program Foundation of the Ministry of Education (200403057002); Anhui University Graduate Innovation Fund (20073056); Natural Science Foundation of the Anhui Provincial Department of Education (KJ2007B239)

Abstract:

Predicting protein-protein interaction sites is an active research topic in bioinformatics. Considering the factors in a protein sequence that affect interface residues, this paper proposes using the protein's evolutionary information and residue conservation as feature functions; such information reflects the influence of short-range and long-range interactions between amino acids in the sequence. A maximum entropy model serves as the classifier for interaction site recognition, fusing the multi-source information into a single probability model. Experimental results show that, compared with other traditional machine learning methods, this method improves specificity by 2%~8% and accuracy by 3%~11%, and achieves a higher correlation coefficient.

Key words: protein interaction sites, maximum entropy, sequence profile, residue conservation, machine learning
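
The classification step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the four-dimensional feature vectors are synthetic stand-ins for profile and conservation features, the optimizer is plain batch gradient ascent (the paper may well use a different training procedure), and the names `train_maxent`, `predict_proba`, and `evaluate` are hypothetical. Specificity, accuracy, and the correlation coefficient follow the standard confusion-matrix definitions, which may differ from the definitions used in the paper.

```python
import math
import random


def predict_proba(w, b, x):
    """P(interface residue | x) under a two-class maximum entropy model."""
    z = b + sum(wj * xj for wj, xj in zip(w, x))
    z = max(-30.0, min(30.0, z))  # clamp so exp() stays well-behaved
    return 1.0 / (1.0 + math.exp(-z))


def train_maxent(X, y, epochs=200, lr=0.5, l2=1e-3):
    """Fit weights by batch gradient ascent on the L2-penalised log-likelihood.

    With two classes, the maximum entropy model reduces to logistic regression.
    """
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            err = yi - predict_proba(w, b, xi)  # observed minus expected
            for j in range(d):
                grad_w[j] += err * xi[j]
            grad_b += err
        w = [wj + lr * (gj / n - l2 * wj) for wj, gj in zip(w, grad_w)]
        b += lr * grad_b / n
    return w, b


def evaluate(y_true, y_pred):
    """Specificity, accuracy, and Matthews correlation coefficient (assumed definitions)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return {
        "specificity": tn / (tn + fp) if (tn + fp) else 0.0,
        "accuracy": (tp + tn) / len(pairs),
        "mcc": (tp * tn - fp * fn) / denom if denom else 0.0,
    }


# Synthetic data only: each "residue" gets 4 made-up feature values.
rng = random.Random(0)
X, y = [], []
for _ in range(200):
    label = rng.randint(0, 1)  # 1 = interface residue, 0 = non-interface
    center = 1.0 if label == 1 else -1.0
    X.append([rng.gauss(center, 0.5) for _ in range(4)])
    y.append(label)

w, b = train_maxent(X, y)
preds = [1 if predict_proba(w, b, xi) >= 0.5 else 0 for xi in X]
metrics = evaluate(y, preds)
print(metrics)
```

In a real setting, each feature vector would presumably be built from a sliding window over the sequence profile and conservation scores, and the metrics would be computed on held-out proteins rather than on the training data.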

CLC Number: