Computer Engineering, 2009, Vol. 35, Issue (19): 192-194. doi: 10.3969/j.issn.1000-3428.2009.19.064

• Artificial Intelligence and Recognition Technology •

Feature Selection for Protein Mass Spectrometry Data Based on Genetic Algorithm

LI Yi-feng (李义峰), LIU Yi-hui (刘毅慧)

  1. (School of Information Science and Technology, Shandong Institute of Light Industry, Jinan 250353)
  • Online: 2009-10-05 Published: 2009-10-05

Abstract: To address problems in dimensionality reduction, classification, and biomarker identification for protein mass spectrometry data, a feature selection method based on the genetic algorithm (GA) is proposed. Several commonly used strategies are introduced, including rank-based stochastic universal sampling selection with elitism and uniform mutation with an adaptive mutation rate. Two fitness functions, a wrapper function and a multivariate filter function, are presented and incorporated into the GA, and their performance is tested and compared. Experimental results show that the wrapper-based GA outperforms the other feature selection algorithms, and that the multivariate filter-based GA outperforms univariate filter algorithms.
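
The strategies named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes binary feature masks as chromosomes, linear rank weights for the stochastic universal sampling, uniform crossover (the abstract does not mention a crossover operator), a hypothetical linearly decaying mutation-rate schedule, and a toy fitness function standing in for the wrapper or filter functions. All function and parameter names are illustrative.

```python
import random


def rank_based_sus(fitness, n_select, rng):
    """Rank-based stochastic universal sampling; returns selected indices."""
    order = sorted(range(len(fitness)), key=lambda i: fitness[i])
    weights = [0.0] * len(fitness)
    for rank, idx in enumerate(order, start=1):  # worst gets 1, best gets n
        weights[idx] = float(rank)
    step = sum(weights) / n_select
    start = rng.uniform(0, step)  # one random offset, equally spaced pointers
    selected, i, cumulative = [], 0, weights[0]
    for k in range(n_select):
        pointer = start + k * step
        while cumulative < pointer and i < len(weights) - 1:
            i += 1
            cumulative += weights[i]
        selected.append(i)
    return selected


def mutate(mask, rate, rng):
    """Uniform mutation: flip each bit independently with probability rate."""
    return [bit ^ 1 if rng.random() < rate else bit for bit in mask]


def adaptive_rate(gen, n_gen, high=0.05, low=0.005):
    """Assumed schedule (not from the paper): linear decay from high to low."""
    if n_gen <= 1:
        return low
    return high - (high - low) * gen / (n_gen - 1)


def toy_fitness(mask, informative=(3, 7, 11)):
    """Hypothetical stand-in fitness: reward informative features, penalize size."""
    return sum(mask[i] for i in informative) - 0.1 * sum(mask)


def ga_select(fitness_fn, n_features, pop_size=30, n_gen=40, n_elite=2, seed=0):
    """Run the GA and return (best_mask, best_fitness)."""
    rng = random.Random(seed)
    pop = [[rng.randint(0, 1) for _ in range(n_features)] for _ in range(pop_size)]
    for gen in range(n_gen):
        scores = [fitness_fn(ind) for ind in pop]
        ranked = sorted(range(pop_size), key=lambda i: scores[i], reverse=True)
        elite = [pop[i][:] for i in ranked[:n_elite]]  # elitism: keep the best
        picks = rank_based_sus(scores, pop_size - n_elite, rng)
        parents = [pop[i] for i in picks]
        rng.shuffle(parents)
        rate = adaptive_rate(gen, n_gen)
        children = []
        for k, parent in enumerate(parents):
            mate = parents[(k + 1) % len(parents)]
            # Uniform crossover (an assumption of this sketch).
            child = [a if rng.random() < 0.5 else b for a, b in zip(parent, mate)]
            children.append(mutate(child, rate, rng))
        pop = elite + children
    scores = [fitness_fn(ind) for ind in pop]
    best = max(range(pop_size), key=lambda i: scores[i])
    return pop[best], scores[best]


if __name__ == "__main__":
    mask, score = ga_select(toy_fitness, n_features=20)
    print("selected features:", [i for i, bit in enumerate(mask) if bit])
    print("fitness:", score)
```

In a wrapper setting, `fitness_fn` would instead train and validate a classifier on the features selected by the mask, while a multivariate filter would score the selected subset jointly without training a classifier. Both are left out here to keep the sketch self-contained.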

Key words: mass spectrometry, genetic algorithm (GA), feature selection
