Computer Engineering, 2020, Vol. 46, Issue (12): 52-59. doi: 10.19678/j.issn.1000-3428.0056545

• Artificial Intelligence and Pattern Recognition •

Improved Dynamic PPI Network Construction and Protein Function Prediction Algorithm

LI Peng1,2,3, MIN Hui4, LUO Aijing1,3, QU Haoyu2, YI Na2, XU Jiaqi2

  1. The Third Xiangya Hospital of Central South University, Changsha 410006, China;
  2. School of Information Science and Engineering, Hunan University of Chinese Medicine, Changsha 410208, China;
  3. Key Laboratory of Medical Information Research of College of Hunan Province (Central South University), Changsha 410006, China;
  4. School of Software, Hunan College of Information, Changsha 410200, China
  • Received: 2019-11-08 Revised: 2019-12-17 Published: 2019-12-24
  • About the authors: LI Peng (born 1983), male, lecturer, Ph.D., postdoctoral researcher; main research interests are bioinformatics, machine learning, and big data in traditional Chinese medicine. MIN Hui (corresponding author), lecturer, M.S. LUO Aijing, professor and doctoral supervisor. QU Haoyu, laboratory technician, M.S. YI Na and XU Jiaqi, undergraduate students.
  • Funding:
    National Key Research and Development Program of China (2017YFC1703306); Key Project of the National Social Science Fund of China (17AZD037); Youth Project of the Hunan Provincial Natural Science Foundation (2019JJ50453); General Project of the Hunan Provincial Natural Science Foundation (2018JJ2301); Key Project of the Hunan Provincial Department of Science and Technology (2018JJ2301); Open Fund of Hunan University of Chinese Medicine (2018JK02).

Abstract: Constructing a reliable dynamic protein network is key to improving both the prediction of unknown protein functions and the identification of protein complexes. However, existing protein network construction and function prediction methods generally suffer from low robustness and insufficient prediction accuracy. To address this, the paper designs an improved dynamic protein network construction algorithm. Protein-protein interactions are modeled with an evolutionary graph, and the whole protein network is divided into dynamic subnetworks over multiple time slices according to the active periods of the proteins. Within each subnetwork, interactions are determined by the connection strength between proteins, which yields a global dynamic protein network. On this basis, a function prediction algorithm, IPA-PF, based on a functional correlation score or a neural network, is proposed by examining the differences in functional annotation among the neighbor nodes of proteins with unknown function. Experimental results on several public biological datasets show that IPA-PF outperforms the HPMM, D-PIN, EFM and FP-BMD algorithms in recall, precision and F-measure, and that it is insensitive to its input parameters. While maintaining prediction accuracy, its time complexity remains within a reasonable range.
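The time-slicing idea in the abstract can be illustrated with a minimal sketch. It is not the paper's algorithm: the abstract gives no formulas, so the activity rule (expression above the mean plus `k` standard deviations), the `min_strength` cutoff, and all function and parameter names below are assumptions chosen only for illustration.

```python
from statistics import mean, pstdev


def active_slices(expression, k=1.0):
    """Return the time-slice indices at which a protein counts as active.

    Assumption (not from the paper): a protein is active when its expression
    exceeds mean + k * std of its own profile.
    """
    threshold = mean(expression) + k * pstdev(expression)
    return {t for t, value in enumerate(expression) if value > threshold}


def build_dynamic_network(edges, expression, min_strength=0.5, k=1.0):
    """Split a static weighted PPI network into per-time-slice subnetworks.

    edges: dict mapping (protein_a, protein_b) -> connection strength
    expression: dict mapping protein -> list of expression values per slice
    Returns a dict mapping time-slice index -> set of retained edges.
    """
    active = {p: active_slices(profile, k) for p, profile in expression.items()}
    n_slices = max(len(profile) for profile in expression.values())
    subnets = {t: set() for t in range(n_slices)}
    for (a, b), strength in edges.items():
        # Hypothetical cutoff: drop weak interactions before slicing.
        if strength < min_strength:
            continue
        # Keep the edge only in slices where both endpoints are active.
        for t in active.get(a, set()) & active.get(b, set()):
            subnets[t].add((a, b))
    return subnets
```

The union of the per-slice edge sets plays the role of the global dynamic network here. How connection strength is actually computed is left open, since the abstract does not define it.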

Key words: dynamic protein network, evolutionary graph, connection strength, function prediction, neural network
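Neighbor-based function prediction and the evaluation metrics named in the abstract (recall, precision, F-measure) can be sketched as follows. The scoring rule is a plain strength-weighted neighbor vote, used here as a stand-in: it is not the functional correlation score or neural network of IPA-PF, whose details the abstract does not give. All names and data structures are hypothetical.

```python
from collections import defaultdict


def score_functions(protein, neighbors, annotations):
    """Rank candidate functions for a protein by a weighted neighbor vote.

    neighbors: dict protein -> dict of neighbor -> connection strength
    annotations: dict protein -> set of known function labels
    Returns (function, score) pairs, highest score first.
    """
    scores = defaultdict(float)
    for neighbor, strength in neighbors.get(protein, {}).items():
        for function in annotations.get(neighbor, set()):
            scores[function] += strength
    # Sort by descending score, then by label for a stable order.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


def precision_recall_f(predicted, actual):
    """Compute precision, recall and F-measure for two sets of function labels."""
    true_positives = len(predicted & actual)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(actual) if actual else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f_measure = 2 * precision * recall / (precision + recall)
    return precision, recall, f_measure
```

A typical use would take the top-ranked labels from `score_functions` as the predicted set and compare them against held-out annotations with `precision_recall_f`.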

CLC number: