
Computer Engineering ›› 2026, Vol. 52 ›› Issue (3): 243-254. doi: 10.19678/j.issn.1000-3428.0070175

• Multimodal Information Fusion •

Research on Android Malware Detection Model Based on Multi-modal Feature Fusion

ZHANG Zhi, YIN Yukai*, SUN Yiling, MENG Wenjing, PENG Chang

  1. School of Information Engineering, Zhongnan University of Economics and Law, Wuhan 430073, Hubei, China
  • Received:2024-07-25 Revised:2024-10-07 Online:2026-03-15 Published:2024-12-19
  • Contact: YIN Yukai

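  • About the authors:

    ZHANG Zhi (CCF member), female, associate professor, Ph.D.; main research interests: artificial intelligence, information security, and privacy computing

    YIN Yukai (corresponding author), master's student

    SUN Yiling, undergraduate student

    MENG Wenjing, master's student

    PENG Chang, master's student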

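  • Funding:
    Fundamental Research Funds for the Central Universities (2722024BY022); Zhongnan University of Economics and Law Graduate Top Talent Training Program (YJ20230043); Fundamental Research Funds for the Central Universities, Zhongnan University of Economics and Law (202451402)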

Abstract:

Owing to the heterogeneity and complexity of Android malware, traditional static analysis methods that rely on a single feature type, such as permissions or API calls, often struggle to distinguish benign from malicious applications. To address this limitation, this study examines Android software features including permissions, API calls, bytecode, and opcodes, and proposes a feature construction method based on multi-modal feature fusion. The bytecode is transformed into RGB images, and visual representations are extracted with the pretrained EfficientNetV2B3 model to capture the high-level characteristics of Android applications. In addition, Locality-Sensitive Hashing (LSH) is used to extract opcode sequence features that represent the low-level, detailed characteristics of each application. These two heterogeneous feature sets are then fused with the Multimodal Factorized Bilinear pooling (MFB) algorithm so that they complement each other and yield a more discriminative static representation. Building on this fused representation, a Transformer Encoder-based Android Anomaly Detection (TEAAD) model is introduced, which uses the Transformer architecture to detect Android malware. Experimental results show that the TEAAD model based on the fused features outperforms other deep-learning models, achieving a detection accuracy of 96.87%, and that the MFB feature fusion method identifies malware more effectively than the other methods compared.

Key words: Android malware, pre-trained model, Locality-Sensitive Hashing (LSH), feature fusion, deep learning
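
The feature pipeline summarized in the abstract (bytecode to RGB image, LSH over opcode sequences, MFB fusion) can be illustrated with a minimal standard-library Python sketch. It is not the authors' implementation, and several parts are assumptions made only for illustration: the image width, the use of MinHash as the LSH family, the opcode n-gram size, the function names, and the random projection matrices that stand in for learned MFB weights. The pretrained EfficientNetV2B3 extractor and the Transformer encoder are omitted; scaled raw pixel values stand in for the CNN features.

```python
import hashlib
import math
import random


def bytes_to_rgb(data: bytes, width: int = 4):
    """Pack raw bytes into rows of (R, G, B) pixels, zero-padding the last row.

    The width is an arbitrary choice for this sketch, not a value from the paper.
    """
    row_bytes = width * 3
    padded = data + bytes((-len(data)) % row_bytes)
    pixels = [tuple(padded[i:i + 3]) for i in range(0, len(padded), 3)]
    return [pixels[r:r + width] for r in range(0, len(pixels), width)]


def minhash_signature(opcodes, n: int = 3, num_hashes: int = 16):
    """MinHash signature over opcode n-grams.

    MinHash is one common LSH family; the abstract does not say which
    variant the paper uses, so this choice is an assumption.
    """
    shingles = {" ".join(opcodes[i:i + n]) for i in range(len(opcodes) - n + 1)}
    if not shingles:
        raise ValueError("need at least n opcodes")
    signature = []
    for seed in range(num_hashes):
        salt = seed.to_bytes(8, "little")  # one salted hash function per seed
        signature.append(min(
            int.from_bytes(
                hashlib.blake2b(s.encode(), digest_size=8, salt=salt).digest(), "big")
            for s in shingles))
    return signature


def mfb_fuse(x, y, U, V, k: int):
    """MFB fusion: sum-pool (U^T x) * (V^T y) in windows of k, then normalize.

    U and V are given as lists of k * o projection vectors (one per row).
    In a trained model they are learned; here the caller supplies them.
    """
    px = [sum(a * b for a, b in zip(row, x)) for row in U]
    py = [sum(a * b for a, b in zip(row, y)) for row in V]
    joint = [a * b for a, b in zip(px, py)]            # element-wise product
    pooled = [sum(joint[i:i + k]) for i in range(0, len(joint), k)]
    powered = [math.copysign(math.sqrt(abs(v)), v) for v in pooled]  # power norm
    norm = math.sqrt(sum(v * v for v in powered)) or 1.0
    return [v / norm for v in powered]                 # L2 normalization


if __name__ == "__main__":
    rng = random.Random(0)
    # Hypothetical inputs: a fake bytecode blob and a short opcode sequence.
    image = bytes_to_rgb(b"dex\n035\x00" + bytes(range(40)), width=4)
    x = [channel / 255 for row in image for pixel in row for channel in pixel]
    sig = minhash_signature(["const", "invoke-virtual", "move-result", "if-eqz", "return"])
    y = [h / 2 ** 64 for h in sig]
    k, o = 4, 8  # pooling window and fused dimension, chosen arbitrarily
    U = [[rng.gauss(0, 1) for _ in x] for _ in range(k * o)]
    V = [[rng.gauss(0, 1) for _ in y] for _ in range(k * o)]
    print(mfb_fuse(x, y, U, V, k))
```

In a setup like the one the abstract describes, the fused vector returned by `mfb_fuse` would be the input to the downstream classifier, and the projection matrices would be trained jointly with it rather than sampled at random.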
