Computer Engineering, 2019, Vol. 45, Issue (12): 141-146. doi: 10.19678/j.issn.1000-3428.0053136

• Security Technology •

Static Vulnerability Detection Based on Neural Network and Code Similarity

XIA Zhiyang1, YI Ping1, YANG Tao2

  1. School of Cyber Science and Engineering, Shanghai Jiao Tong University, Shanghai 200240, China;
  2. Key Laboratory of Information Network Security of Ministry of Public Security, Shanghai 201204, China
  • Received: 2018-11-14 Revised: 2019-02-16 Published: 2019-02-27
  • About the authors: XIA Zhiyang (born 1993), male, master's student, main research interests: software vulnerability detection and deep learning; YI Ping, associate professor, Ph.D.; YANG Tao, researcher, Ph.D.
  • Funding:
    National Natural Science Foundation of China (61571290, 61831007, 61431008); National Key Research and Development Program of China (2017YFB0802900); Open Project of the Key Laboratory of Information Network Security of Ministry of Public Security (C18611).

Abstract: Static vulnerability detection usually examines only the program text, which makes it efficient but prone to false positives. To address this problem, this paper proposes a vulnerability detection method that combines neural networks with code similarity. The source code is first preprocessed through sensitive function localization, program slicing, and variable substitution to obtain the training data. A similarity discrimination model based on Bi-LSTM is then built and a vulnerability template database is set up; the code under test is compared with the vulnerability templates to determine whether it contains a vulnerability. Experimental results show that the proposed method reaches an accuracy of up to 88.1% with a false positive rate as low as 4.7%.

Key words: software security, static vulnerability detection, deep learning, neural network, code similarity
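
The pipeline outlined in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The sensitive-function list, the keyword set, the `VARn` naming scheme, and the 0.8 threshold are all assumptions made for illustration; program slicing is omitted; and the Bi-LSTM similarity model is replaced by a plain token-sequence ratio from the standard library's `difflib`, purely as a stand-in so the example runs without a trained network.

```python
import re
from difflib import SequenceMatcher

# Hypothetical list of sensitive C library calls (the paper's list is not given here).
SENSITIVE_FUNCS = {"strcpy", "memcpy", "sprintf", "gets"}
# Small, incomplete set of C keywords that must not be renamed.
C_KEYWORDS = {"char", "int", "void", "if", "else", "for", "while", "return", "sizeof"}

# Identifiers, integer literals, or any single non-space symbol.
TOKEN_RE = re.compile(r"[A-Za-z_]\w*|\d+|\S")


def sensitive_lines(source):
    """Return 1-based numbers of lines that call a sensitive function."""
    hits = []
    for number, line in enumerate(source.splitlines(), start=1):
        if any(re.search(rf"\b{name}\s*\(", line) for name in SENSITIVE_FUNCS):
            hits.append(number)
    return hits


def normalize(code):
    """Tokenize and replace user identifiers with VAR1, VAR2, ... by first use."""
    mapping = {}
    tokens = []
    for tok in TOKEN_RE.findall(code):
        is_identifier = tok[0].isalpha() or tok[0] == "_"
        if is_identifier and tok not in C_KEYWORDS and tok not in SENSITIVE_FUNCS:
            if tok not in mapping:
                mapping[tok] = f"VAR{len(mapping) + 1}"
            tok = mapping[tok]
        tokens.append(tok)
    return tokens


def similarity(tokens_a, tokens_b):
    """Stand-in for the Bi-LSTM model: ratio of matching tokens in [0, 1]."""
    return SequenceMatcher(None, tokens_a, tokens_b, autojunk=False).ratio()


def is_vulnerable(code, templates, threshold=0.8):
    """Flag code whose normalized tokens resemble any vulnerability template."""
    tokens = normalize(code)
    return any(similarity(tokens, normalize(t)) >= threshold for t in templates)


if __name__ == "__main__":
    templates = ["char buf[10]; strcpy(buf, src);"]
    print(sensitive_lines("int x;\nstrcpy(a, b);\n"))
    print(is_vulnerable("char dst[10]; strcpy(dst, input);", templates))
    print(is_vulnerable("int n = 0; return n;", templates))
```

Variable substitution is what lets two snippets that differ only in identifier names map to the same token sequence, so the comparison against a template depends on code structure rather than naming. In the method described above, a trained Bi-LSTM model would take the place of the `similarity` function.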

CLC number: