Smart Contract Vulnerability Detection Technology Based on Abstract Syntax Tree Embedding

doi:10.19678/j.issn.1000-3428.0069306

Abstract:

Existing deep learning-based approaches to smart contract vulnerability detection typically treat bytecode or source code as a plain text sequence, which captures program semantics poorly. The proposed technique, based on Abstract Syntax Tree (AST) embedding, accounts for the syntactic and semantic features needed for contract vectorization, as well as an appropriate processing granularity, so that vulnerability features are captured more accurately. A vectorization method is designed on top of Solidity syntax tree parsing: the AST is recursively split at statement-level node types into a sequence of statement trees, and a recursive neural network encodes each statement tree from the bottom up, turning the complex AST structure into statement-level feature vectors. On this basis, a bidirectional gated recurrent neural network with an attention mechanism (BiGRU-ATT) learns features from the statement tree sequence and detects and classifies five typical vulnerabilities: reentrancy, unchecked return values, timestamp dependency, access control, and denial-of-service attacks. Experimental results show that, compared with vectorizing source code directly as a text sequence, the AST embedding-based method improves micro-F1 and macro-F1 by 13 and 10 percentage points, respectively. On the timestamp dependency, access control, and denial-of-service classification tasks, the BiGRU-ATT model achieves F1 values above 88%.
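
The statement-tree splitting and bottom-up encoding described in the abstract can be illustrated with a minimal Python sketch. Everything concrete here is an assumption made for illustration and is not taken from the paper: the `Node` class, the small `STATEMENT_KINDS` set, the toy dimension `DIM`, the random untrained weights `W`, and the update rule (tanh of a projected node embedding plus the sum of child states, followed by max pooling over all nodes).

```python
import math
import random
from dataclasses import dataclass, field

# Hypothetical subset of statement-level node kinds; a real Solidity
# grammar defines many more.
STATEMENT_KINDS = {
    "FunctionDefinition", "IfStatement",
    "ExpressionStatement", "ReturnStatement",
}
DIM = 4  # toy vector size, chosen only for readability


@dataclass
class Node:
    kind: str
    children: list = field(default_factory=list)


def prune(node):
    """Copy a node, cutting off any nested statement subtrees."""
    kept = [prune(c) for c in node.children if c.kind not in STATEMENT_KINDS]
    return Node(node.kind, kept)


def split_statement_trees(root):
    """Preorder walk that emits one pruned tree per statement node."""
    trees = []

    def visit(node):
        if node.kind in STATEMENT_KINDS:
            trees.append(prune(node))
        for child in node.children:
            visit(child)

    visit(root)
    return trees


# Untrained, randomly initialised projection matrix (assumption).
_rng = random.Random(0)
W = [[_rng.uniform(-0.5, 0.5) for _ in range(DIM)] for _ in range(DIM)]


def embed(kind):
    """Deterministic pseudo-random embedding for a node kind."""
    rng = random.Random(kind)
    return [rng.uniform(-1.0, 1.0) for _ in range(DIM)]


def encode(node, hidden_states):
    """Bottom-up pass: h = tanh(W x + sum of child hidden states)."""
    x = embed(node.kind)
    h = [sum(W[i][j] * x[j] for j in range(DIM)) for i in range(DIM)]
    for child in node.children:
        child_h = encode(child, hidden_states)
        h = [a + b for a, b in zip(h, child_h)]
    h = [math.tanh(v) for v in h]
    hidden_states.append(h)
    return h


def statement_vector(tree):
    """Max-pool the hidden states of all nodes in one statement tree."""
    hidden_states = []
    encode(tree, hidden_states)
    return [max(column) for column in zip(*hidden_states)]


# Toy AST for a function whose body is an `if` guarding a call and an assignment.
example = Node("FunctionDefinition", [
    Node("Block", [
        Node("IfStatement", [
            Node("BinaryOperation"),
            Node("Block", [
                Node("ExpressionStatement", [Node("FunctionCall")]),
                Node("ExpressionStatement", [Node("Assignment")]),
            ]),
        ]),
    ]),
])

trees = split_statement_trees(example)
vectors = [statement_vector(t) for t in trees]
print([t.kind for t in trees])
```

The resulting list of statement vectors is the kind of sequence a recurrent model such as BiGRU-ATT would consume; in a trained system the embeddings and weights would be learned rather than drawn at random.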

Key words: blockchain security, smart contract, vulnerability detection, Abstract Syntax Tree (AST), deep learning
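
The abstract reports both micro-F1 and macro-F1, and the two can diverge when vulnerability classes are imbalanced. The following Python sketch shows the standard definitions on made-up per-class counts; the counts and the `counts` dictionary layout are illustrative assumptions and are not results from the paper.

```python
def f1(tp, fp, fn):
    """F1 from true positives, false positives, and false negatives."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def micro_macro_f1(counts):
    """counts maps class name -> (tp, fp, fn)."""
    # Micro-F1 pools the counts of all classes before computing F1.
    tp = sum(c[0] for c in counts.values())
    fp = sum(c[1] for c in counts.values())
    fn = sum(c[2] for c in counts.values())
    micro = f1(tp, fp, fn)
    # Macro-F1 averages the per-class F1 values with equal weight.
    macro = sum(f1(*c) for c in counts.values()) / len(counts)
    return micro, macro


# Illustrative counts only: two well-detected classes and one rare, weak class.
counts = {
    "reentrancy": (8, 2, 2),
    "timestamp_dependency": (9, 1, 1),
    "access_control": (1, 1, 3),
}

micro, macro = micro_macro_f1(counts)
print(f"micro-F1={micro:.3f} macro-F1={macro:.3f}")  # micro-F1=0.783 macro-F1=0.678
```

Because macro-F1 weights every class equally, the poorly detected rare class pulls it below micro-F1 in this toy case, which is why reporting both gives a fuller picture of multi-class performance.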

XU Ying, FU Ziwei, ZHANG Wei, CHEN Yunfang. Smart Contract Vulnerability Detection Technology Based on Abstract Syntax Tree Embedding[J]. Computer Engineering, 2025, 51(9): 149-157.

https://www.ecice06.com/CN/Y2025/V51/I9/149

Figures/Tables (9)

Fig.1 Technical approach to smart contract vulnerability detection and classification

Fig.2 Example of a smart contract AST structure

Fig.3 Performance comparison of different models
