Computer Engineering, 2010, Vol. 36, Issue (06): 149-151. doi: 10.3969/j.issn.1000-3428.2010.06.050

• Security Technology •

Malicious Code Detection Method Based on Weighted Information Gain

ZHANG Xiao-kang (张小康), SHUAI Jian-mei (帅建梅), SHI Lin (史林)

  1. (Department of Automation, University of Science & Technology of China, Hefei 230027)
  • Received: 1900-01-01 Revised: 1900-01-01 Online: 2010-03-20 Published: 2010-03-20

Abstract: This paper applies data mining to malicious code detection and proposes a feature selection method based on weighted information gain. By considering both class-wise feature frequency and information gain, the method selects effective features more accurately and thereby improves detection performance. A malicious code detection system is implemented that uses binary-code N-grams and variable-length N-grams for feature extraction, weighted information gain for feature selection, and several classifiers for detection. Experimental results show that the method effectively improves the detection rate and accuracy of malicious code detection.

Key words: data mining, variable-length N-gram, feature selection, information gain
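
The pipeline named in the abstract (byte N-gram extraction followed by information-gain-based feature selection) can be illustrated with a minimal sketch. The abstract does not give the paper's weighting formula, so the weight used here (the larger of the two class-wise document frequencies) is purely an assumption for illustration, as are all function names. Only fixed-length N-grams are shown; variable-length N-grams are not covered.

```python
import math
from collections import Counter


def byte_ngrams(data: bytes, n: int) -> set:
    """Return the set of distinct byte n-grams that occur in one file."""
    return {data[i:i + n] for i in range(len(data) - n + 1)}


def entropy(counts) -> float:
    """Shannon entropy in bits of a distribution given as raw counts."""
    total = sum(counts)
    if total == 0:
        return 0.0
    return -sum(c / total * math.log2(c / total) for c in counts if c > 0)


def information_gain(present_mal, present_ben, total_mal, total_ben) -> float:
    """Standard information gain of a present/absent feature w.r.t. the class."""
    absent_mal = total_mal - present_mal
    absent_ben = total_ben - present_ben
    total = total_mal + total_ben
    n_present = present_mal + present_ben
    n_absent = absent_mal + absent_ben
    h_class = entropy([total_mal, total_ben])
    h_cond = (n_present / total) * entropy([present_mal, present_ben]) \
        + (n_absent / total) * entropy([absent_mal, absent_ben])
    return h_class - h_cond


def weighted_information_gain(present_mal, present_ben, total_mal, total_ben) -> float:
    """Information gain scaled by a frequency weight.

    ASSUMPTION: the weight is the larger class-wise document frequency.
    This is a stand-in, not the formula from the paper.
    Both classes must contain at least one file.
    """
    weight = max(present_mal / total_mal, present_ben / total_ben)
    return weight * information_gain(present_mal, present_ben, total_mal, total_ben)


def select_features(malicious, benign, n=2, top_k=3):
    """Rank n-grams by the hypothetical weighted score and keep the top_k."""
    # Count, for each n-gram, how many files of each class contain it.
    mal_counts = Counter(g for f in malicious for g in byte_ngrams(f, n))
    ben_counts = Counter(g for f in benign for g in byte_ngrams(f, n))
    scores = {
        g: weighted_information_gain(
            mal_counts[g], ben_counts[g], len(malicious), len(benign))
        for g in set(mal_counts) | set(ben_counts)
    }
    # Sort by descending score; break ties by byte order for determinism.
    return sorted(scores, key=lambda g: (-scores[g], g))[:top_k]


if __name__ == "__main__":
    # Toy byte strings standing in for malicious and benign binaries.
    malicious = [b"\xde\xad\xbe", b"\xde\xad\x00"]
    benign = [b"\x00\x01\x02", b"\x01\x02\x03"]
    print(select_features(malicious, benign, n=2, top_k=2))
```

The selected N-grams would then serve as the feature vector passed to a classifier. Scaling by frequency is meant to demote N-grams that separate the classes well on paper but appear in very few files; how the paper actually combines the two quantities should be taken from the full text.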

CLC Number: