Computer Engineering (计算机工程) ›› 2024, Vol. 50 ›› Issue (8): 133-141. doi: 10.19678/j.issn.1000-3428.0068522

• Cyberspace Security •

Smart Contract Defect Detection Method Based on Multi-Feature Fusion

Yifeng WANG1, Cheng ZENG2,4, Qingyu QUAN1, Jiaoran WANG1, Peng HE3,4,*

  1. School of Computer Science and Information Engineering, Hubei University, Wuhan 430062, Hubei, China
  2. School of Artificial Intelligence, Hubei University, Wuhan 430062, Hubei, China
  3. School of Cyber Science and Technology, Hubei University, Wuhan 430062, Hubei, China
  4. Key Laboratory of Intelligent Sensing System and Security, Ministry of Education, Hubei University, Wuhan 430062, Hubei, China
  • Received: 2023-10-08 Published: 2024-08-15 Released: 2023-12-29
  • Corresponding author: Peng HE
  • Funding:
    Key Research and Development Program of Hubei Province (2021BAA188); Key Research and Development Program of Hubei Province (2021BAA184)

Abstract:

Smart contracts are among the most successful applications of blockchain technology, and their widespread use has drawn researchers' attention to their security. Although defect detection in smart contracts has been studied, existing work does not fully mine the features of smart contract code. This paper proposes a smart contract defect detection method based on multi-feature fusion. First, the smart contract code is preprocessed through color labeling, vocabulary extraction, ASCII character conversion, and extraction of the inheritance relationships between contracts. Next, the outputs of the first three steps are fed into a fusion model built from Bidirectional Encoder Representations from Transformers (BERT), a Convolutional Neural Network (CNN), and a Bidirectional Long Short-Term Memory (BiLSTM) network for feature extraction; in parallel, the inheritance relationships are fed into the node2vec random walk algorithm to obtain feature vectors of the contract relationships. Finally, all feature vectors are concatenated and passed to a classifier for defect classification. The method is validated on a dataset of real Solidity smart contracts. Experimental results show that, compared with other models, the proposed multi-feature fusion model improves the F1 value by 6%-12% and accuracy by 4%-11%. The method mines the deep features of smart contract code more effectively, improves defect detection performance, and has practical value for smart contract security.
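Two of the preprocessing steps named in the abstract, ASCII character conversion and inheritance extraction, together with the biased random walk that node2vec relies on, can be illustrated with a minimal Python sketch. This is not the authors' implementation: the regular expression, the function names, the padding length, and the `p`/`q` values are all assumptions made for illustration, and the walk only produces node sequences (node2vec would then train a skip-gram model on such sequences to obtain the embedding vectors).

```python
import random
import re

# Hypothetical pattern for Solidity headers such as "contract Token is Base, Ownable {".
# It ignores constructor arguments, interfaces, and libraries (a simplifying assumption).
CONTRACT_RE = re.compile(r"\bcontract\s+(\w+)(?:\s+is\s+([^{]+))?\s*\{")


def to_ascii_codes(source, max_len=16):
    """Map each character to its code point, truncating or zero-padding to max_len."""
    codes = [ord(ch) for ch in source[:max_len]]
    return codes + [0] * (max_len - len(codes))


def inheritance_edges(source):
    """Return (child, parent) pairs read from contract headers."""
    edges = []
    for child, parents in CONTRACT_RE.findall(source):
        for parent in parents.split(","):
            name = parent.strip()
            if name:
                edges.append((child, name))
    return edges


def build_graph(edges):
    """Build an undirected adjacency map from the inheritance edges."""
    graph = {}
    for child, parent in edges:
        graph.setdefault(child, set()).add(parent)
        graph.setdefault(parent, set()).add(child)
    return graph


def biased_walk(graph, start, length, p=1.0, q=1.0, rng=None):
    """Second-order random walk in the style of node2vec.

    Unnormalized step weights: 1/p to return to the previous node, 1 for a
    neighbor of the previous node, and 1/q for any other neighbor.
    """
    rng = rng or random.Random(0)
    walk = [start]
    while len(walk) < length:
        current = walk[-1]
        neighbors = sorted(graph[current])
        if not neighbors:
            break
        if len(walk) == 1:
            # The first step has no previous node, so it is uniform.
            walk.append(rng.choice(neighbors))
            continue
        previous = walk[-2]
        weights = []
        for candidate in neighbors:
            if candidate == previous:
                weights.append(1.0 / p)
            elif candidate in graph[previous]:
                weights.append(1.0)
            else:
                weights.append(1.0 / q)
        walk.append(rng.choices(neighbors, weights=weights, k=1)[0])
    return walk


# Toy input, invented for this example.
SAMPLE = """
contract Base { }
contract Ownable { }
contract Token is Base, Ownable { }
"""

edges = inheritance_edges(SAMPLE)
graph = build_graph(edges)
walk = biased_walk(graph, "Token", length=5, p=0.5, q=2.0)
ascii_features = to_ascii_codes("contract Base { }")
```

In a full pipeline, the vectors learned from such walks would be concatenated with the text features from the BERT, CNN, and BiLSTM branches before classification, as the abstract describes.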

Key words: blockchain, smart contract, Solidity language, multi-feature, defect detection