Computer Engineering, 2025, Vol. 51, Issue 9: 149-157. doi: 10.19678/j.issn.1000-3428.0069306

• Cyberspace Security •

Smart Contract Vulnerability Detection Technology Based on Abstract Syntax Tree Embedding

XU Ying, FU Ziwei, ZHANG Wei, CHEN Yunfang*

  1. School of Computer Science, Nanjing University of Posts and Telecommunications, Nanjing 210023, Jiangsu, China
  • Received: 2024-01-26 Revised: 2024-05-11 Online: 2025-09-15 Published: 2025-09-26
  • Contact: CHEN Yunfang
  • Funding:
    National Natural Science Foundation of China (62072252)

Abstract:

Existing deep learning-based approaches to smart contract vulnerability detection typically treat bytecode or source code as a plain text sequence, which captures program semantics poorly. The proposed detection technique, based on Abstract Syntax Tree (AST) embedding, accounts for the syntactic and semantic features needed to vectorize a contract, as well as an appropriate processing granularity, so that vulnerability features are captured more accurately. Building on Solidity syntax tree parsing, a smart contract vectorization method based on AST embedding is designed: the AST is split recursively at statement-level node types to generate a sequence of statement trees, and a recursive neural network then encodes each statement tree from the bottom up, transforming the complex AST structure into statement-level feature vectors. On this basis, a Bidirectional Gated Recurrent neural network model with an Attention mechanism (BiGRU-ATT) is constructed to learn features from the statement tree sequence and to detect and classify five typical vulnerabilities: re-entrancy, unchecked return values, timestamp dependency, access control, and denial-of-service attacks. Experimental results show that, compared with vectorizing source code directly as a text sequence, the AST embedding-based method improves micro-F1 and macro-F1 by 13 and 10 percentage points, respectively. On the timestamp dependency, access control, and denial-of-service attack classification tasks, the BiGRU-ATT model achieves F1 values above 88%.
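The abstract reports gains in both micro-F1 and macro-F1, which weight classes differently: micro-F1 pools the counts of all classes before computing F1, while macro-F1 averages the per-class F1 values. The following is a minimal sketch of the two metrics, assuming a single-label multi-class setting; the function name `f1_scores` and the toy labels are illustrative assumptions, not taken from the paper, and the paper's exact evaluation protocol may differ.

```python
from collections import Counter


def f1_scores(y_true, y_pred):
    """Return (per_class_f1, micro_f1, macro_f1) for single-label predictions.

    Hypothetical helper for illustration; not the authors' evaluation code.
    """
    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1   # predicted class gains a false positive
            fn[truth] += 1  # true class gains a false negative

    def f1(tp_count, fp_count, fn_count):
        denom = 2 * tp_count + fp_count + fn_count
        return 2 * tp_count / denom if denom else 0.0

    per_class = {c: f1(tp[c], fp[c], fn[c]) for c in labels}
    # Micro: pool the counts over all classes, then compute one F1.
    micro = f1(sum(tp.values()), sum(fp.values()), sum(fn.values()))
    # Macro: unweighted mean of the per-class F1 values.
    macro = sum(per_class.values()) / len(labels)
    return per_class, micro, macro


if __name__ == "__main__":
    # Toy labels, made up purely to exercise the function.
    truth = ["reentrancy", "reentrancy", "timestamp", "dos"]
    preds = ["reentrancy", "timestamp", "timestamp", "dos"]
    print(f1_scores(truth, preds))
```

Because macro-F1 gives every class equal weight, it is more sensitive than micro-F1 to weak performance on rare vulnerability types, which is one reason for reporting both.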

Key words: blockchain security, smart contract, vulnerability detection, Abstract Syntax Tree (AST), deep learning
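The statement-tree splitting described in the abstract can be sketched on a toy AST. The code below is an illustration under stated assumptions, not the authors' implementation: the `Node` class, the `STATEMENT_TYPES` set, and the rule that nested blocks are cut out of the enclosing statement tree are all hypothetical choices, with node type names loosely modeled on those found in Solidity compiler ASTs. The recursive neural network encoder and the BiGRU-ATT model are left out.

```python
from dataclasses import dataclass, field
from typing import List

# Assumed set of statement-level node types (illustrative, not from the paper).
STATEMENT_TYPES = {
    "FunctionDefinition",
    "IfStatement",
    "ExpressionStatement",
    "VariableDeclarationStatement",
    "Return",
}
# Children of these types are cut out of the enclosing statement tree.
CUT_TYPES = STATEMENT_TYPES | {"Block"}


@dataclass
class Node:
    node_type: str
    children: List["Node"] = field(default_factory=list)


def prune(node: Node) -> Node:
    """Copy a subtree, dropping nested blocks and statement-level children."""
    kept = [prune(c) for c in node.children if c.node_type not in CUT_TYPES]
    return Node(node.node_type, kept)


def split_into_statement_trees(root: Node) -> List[Node]:
    """Preorder traversal that emits one pruned tree per statement-level node."""
    trees: List[Node] = []

    def visit(node: Node) -> None:
        if node.node_type in STATEMENT_TYPES:
            trees.append(prune(node))
        for child in node.children:
            visit(child)

    visit(root)
    return trees


def demo_ast() -> Node:
    """Toy AST for a function with an if-statement followed by an assignment."""
    call = Node("ExpressionStatement",
                [Node("FunctionCall", [Node("MemberAccess", [Node("Identifier")])])])
    cond = Node("BinaryOperation", [Node("Identifier"), Node("Identifier")])
    if_stmt = Node("IfStatement", [cond, Node("Block", [call])])
    assign = Node("ExpressionStatement",
                  [Node("Assignment", [Node("Identifier"), Node("Identifier")])])
    return Node("FunctionDefinition", [Node("Block", [if_stmt, assign])])


if __name__ == "__main__":
    for tree in split_into_statement_trees(demo_ast()):
        print(tree.node_type, [c.node_type for c in tree.children])
```

In this sketch, each emitted tree keeps only the header of a compound statement (for example, the condition of an `IfStatement`), while the statements inside its body become separate trees later in the sequence. A bottom-up encoder would then map each tree to a vector, and the resulting vector sequence would feed the sequence model.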