Application of Entropy Visualization Method in Malware Classification

doi:10.3969/j.issn.1000-3428.2017.09.030

Computer Engineering

REN Zhuojun, CHEN Guang

(College of Information Science and Technology, Donghua University, Shanghai 201620, China)

Received: 2016-10-14  Online: 2017-09-15  Published: 2017-09-15
About the authors: REN Zhuojun (born 1984), female, Ph.D. candidate, whose main research interest is network and information security; CHEN Guang, professor, Ph.D.
Funding:
National Natural Science Foundation of China (61671006); Fundamental Research Funds for the Central Universities (14D310407).

Abstract

Abstract: The rapid growth of malware poses a serious threat to the security of information systems. To improve identification efficiency and speed up emergency response, this paper proposes a new visualization method for malware classification that combines the definition of information (Shannon) entropy with the Jaccard index and the K-Nearest Neighbor (KNN) classification algorithm. The method converts each binary file into an entropy pixel image by computing local entropy values, so that the internal features of malware are presented visually and intuitively, and it uses a dimension-reduced display mechanism to make similarity comparison and classification more efficient. The method was evaluated on 664 samples from 66 families, named according to the Kaspersky naming convention. Experimental results show an average classification accuracy of 93.67%, indicating that the method can classify malware samples effectively.
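The abstract does not give the details of the local entropy computation, so the following is only a minimal sketch of the general idea: split a binary into fixed-size blocks, compute the Shannon entropy of each block, and map it to a grayscale pixel. The block size of 256 bytes, the image width of 16 pixels, and the function names `shannon_entropy`, `entropy_pixels` and `to_rows` are assumptions made for illustration, not values or names taken from the paper.

```python
import math
from collections import Counter
from typing import List


def shannon_entropy(block: bytes) -> float:
    """Shannon entropy of a byte block in bits per byte (0.0 to 8.0)."""
    if not block:
        return 0.0
    n = len(block)
    # H = sum(p_i * log2(1 / p_i)) over the byte values present in the block.
    return sum((c / n) * math.log2(n / c) for c in Counter(block).values())


def entropy_pixels(data: bytes, block_size: int = 256) -> List[int]:
    """Map each consecutive block to a grayscale value in 0..255.

    block_size is an assumed parameter, not one reported in the abstract.
    """
    pixels = []
    for start in range(0, len(data), block_size):
        h = shannon_entropy(data[start:start + block_size])
        pixels.append(round(h / 8.0 * 255))  # 8 bits is the maximum per byte
    return pixels


def to_rows(pixels: List[int], width: int = 16) -> List[List[int]]:
    """Wrap the flat pixel list into image rows (the last row may be shorter)."""
    return [pixels[i:i + width] for i in range(0, len(pixels), width)]


if __name__ == "__main__":
    # A constant block (lowest entropy) followed by a block containing every
    # byte value exactly once (highest entropy).
    sample = bytes(256) + bytes(range(256))
    print(to_rows(entropy_pixels(sample)))
```

Under these assumptions, low-entropy regions such as padding come out dark, while high-entropy regions such as packed or encrypted sections come out bright, which is what lets samples of one family look alike at a glance.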

Key words: malware, visualization, pedigree classification, information entropy, Jaccard index, K-Nearest Neighbor (KNN) classification algorithm
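How the Jaccard index and KNN are combined is not spelled out on this page. The sketch below shows one plausible arrangement: each entropy pixel sequence is turned into a set of features, the Jaccard index measures the overlap between two such sets, and a query is assigned the majority label among its k most similar training samples. The feature encoding in `pixel_features` (position paired with a quantized gray level), the choice `k=3`, and all function names are hypothetical and are not taken from the paper.

```python
from collections import Counter
from typing import FrozenSet, List, Sequence, Tuple

Feature = Tuple[int, int]


def pixel_features(pixels: Sequence[int], levels: int = 16) -> FrozenSet[Feature]:
    """Hypothetical encoding: (position, quantized gray level) pairs."""
    return frozenset((i, p * levels // 256) for i, p in enumerate(pixels))


def jaccard_index(a: FrozenSet, b: FrozenSet) -> float:
    """|A intersect B| / |A union B|; two empty sets count as identical."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def knn_classify(query: FrozenSet,
                 training: List[Tuple[FrozenSet, str]],
                 k: int = 3) -> str:
    """Majority label among the k training samples most similar to query."""
    ranked = sorted(training,
                    key=lambda item: jaccard_index(query, item[0]),
                    reverse=True)
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


if __name__ == "__main__":
    # Toy data: family "A" has a bright start, family "B" a bright end.
    training = [
        (pixel_features([250, 240, 10, 10]), "A"),
        (pixel_features([255, 245, 20, 5]), "A"),
        (pixel_features([10, 5, 250, 240]), "B"),
        (pixel_features([0, 15, 255, 250]), "B"),
    ]
    print(knn_classify(pixel_features([252, 241, 12, 8]), training))
```

Set-based Jaccard similarity is convenient here because it is bounded between 0 and 1 and ignores features that are absent from both samples; a weighted variant over the gray levels would be an equally reasonable choice.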

CLC number: TP309

REN Zhuojun, CHEN Guang. Application of Entropy Visualization Method in Malware Classification[J]. Computer Engineering, doi: 10.3969/j.issn.1000-3428.2017.09.030.

http://www.ecice06.com/CN/Y2017/V43/I9/167


