Static Vulnerability Detection Based on Neural Network and Code Similarity

Computer Engineering, 2019, Vol. 45, Issue (12): 141-146. doi: 10.19678/j.issn.1000-3428.0053136

XIA Zhiyang¹, YI Ping¹, YANG Tao²

1. School of Cyber Science and Engineering, Shanghai Jiao Tong University, Shanghai 200240, China;
2. Key Laboratory of Information Network Security of Ministry of Public Security, Shanghai 201204, China

Received: 2018-11-14 Revised: 2019-02-16 Published: 2019-02-27
About the authors: XIA Zhiyang (born 1993), male, master's student, whose main research interests are software vulnerability detection and deep learning; YI Ping, associate professor, Ph.D.; YANG Tao, research fellow, Ph.D.
Supported by: National Natural Science Foundation of China (61571290, 61831007, 61431008); National Key Research and Development Program of China (2017YFB0802900); Open Project of the Key Laboratory of Information Network Security of Ministry of Public Security (C18611).

Abstract

Abstract: Static vulnerability detection usually examines only the program text, which makes it efficient to run but prone to false positives. To address this problem, this paper combines neural network techniques with code similarity and proposes a vulnerability detection method. The source code is first preprocessed by locating sensitive functions, slicing the program, and substituting variable names, which yields the data used for training. A similarity discrimination model based on Bi-LSTM is then built and a vulnerability template database is set up; the code under test is compared against the vulnerability templates to determine whether it is vulnerable. Experimental results show that the proposed method reaches an accuracy of 88.1% with a false positive rate as low as 4.7%.

Key words: software security, static vulnerability detection, deep learning, neural network, code similarity
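
The preprocessing and template comparison described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the sensitive-function list, the keyword list, the `VAR1`-style renaming scheme, the 0.8 threshold, and all function names are hypothetical, program slicing is omitted, and a token-level `difflib.SequenceMatcher` ratio stands in for the Bi-LSTM similarity model.

```python
import re
from difflib import SequenceMatcher

# Assumed examples of sensitive C library calls; the paper's own list is not given here.
SENSITIVE_FUNCTIONS = {"strcpy", "strcat", "sprintf", "gets", "memcpy"}

# A small, incomplete set of C keywords that must not be renamed.
C_KEYWORDS = {
    "char", "int", "long", "short", "unsigned", "void", "const", "sizeof",
    "if", "else", "for", "while", "return", "struct",
}

IDENTIFIER = re.compile(r"[A-Za-z_]\w*")
TOKEN = re.compile(r"[A-Za-z_]\w*|\d+|\S")


def find_sensitive_calls(code):
    """Return the sensitive function names called in the code, in order."""
    calls = re.findall(r"\b([A-Za-z_]\w*)\s*\(", code)
    return [name for name in calls if name in SENSITIVE_FUNCTIONS]


def normalize(code):
    """Tokenize the code and rename user identifiers to VAR1, VAR2, ..."""
    mapping = {}
    tokens = []
    for tok in TOKEN.findall(code):
        keep = tok in C_KEYWORDS or tok in SENSITIVE_FUNCTIONS
        if IDENTIFIER.fullmatch(tok) and not keep:
            # The same original name always maps to the same placeholder.
            tok = mapping.setdefault(tok, "VAR%d" % (len(mapping) + 1))
        tokens.append(tok)
    return tokens


def similarity(code_a, code_b):
    """Stand-in similarity score in [0, 1]; NOT the paper's Bi-LSTM model."""
    return SequenceMatcher(None, normalize(code_a), normalize(code_b)).ratio()


def matches_template(code, templates, threshold=0.8):
    """Flag the code if it is similar enough to any vulnerability template."""
    return any(similarity(code, t) >= threshold for t in templates)


# Hypothetical template database with a single unchecked-copy pattern.
templates = ["char dst[10]; strcpy(dst, input);"]
candidate = "char buffer[10]; strcpy(buffer, user_data);"

print(find_sensitive_calls(candidate))
print(matches_template(candidate, templates))
```

Renaming identifiers before comparison means two snippets that differ only in variable names produce the same token sequence, which is the motivation for the variable substitution step; a learned model would replace `similarity` while the surrounding lookup stays the same.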

CLC number: TP309

XIA Zhiyang, YI Ping, YANG Tao. Static Vulnerability Detection Based on Neural Network and Code Similarity[J]. Computer Engineering, 2019, 45(12): 141-146.

http://www.ecice06.com/CN/Y2019/V45/I12/141

Figures/Tables: 9

