
Computer Engineering ›› 2025, Vol. 51 ›› Issue (7): 232-243. doi: 10.19678/j.issn.1000-3428.0069837

• Cyberspace Security •

Text Adversarial Sample Generation Method for Chinese Language with Multi-level Perturbation Localization

HOU Yan1, CHE Lei1,*, LI Hui2

  1. School of Information Management, Beijing University of Information Technology, Beijing 100192, China
  2. Zhongguancun Intelligent Artificial Intelligence Research Institute, Beijing 100020, China
  • Received: 2024-05-13 Published: 2025-07-15 Online: 2024-10-14
  • Corresponding author: CHE Lei
  • Supported by:
    2023 Open Fund Project of the Fujian Provincial University Engineering Research Center for Intangible Cultural Heritage Digitization and Multi-source Information Fusion (G3-KF2301)

Abstract:

To improve the precision of perturbation localization when generating adversarial samples under black-box attacks on Chinese text, and to address the fact that existing methods overlook contextual relevance and semantic density when evaluating word importance, this study proposes a Chinese text adversarial sample generation method with multi-level perturbation localization (MDLM). First, multi-source heterogeneous deep learning models are integrated to build a multi-level decision model that combines different feature extraction capabilities. Second, three new evaluation functions are added to word importance evaluation so that the importance of each word is assessed along multiple dimensions. Finally, the multi-level decision model and the evaluation functions work together to precisely locate the perturbation points in the original text. For adversarial sample generation, MDLM combines several text replacement strategies, including traditional Chinese characters, Pinyin, polyphonic words, and homophones, with the aim of maintaining the attack success rate while increasing the diversity of the generated adversarial samples. Experimental results show that MDLM achieves a significant perturbation effect when attacking multiple target models on multiple datasets, with a maximum attack perturbation rate of 43.5%, which further strengthens the attack capability of the adversarial samples. Ablation experiments on the multi-level perturbation localization capability show that combining the evaluation functions with the decision model at multiple levels significantly improves the attack effectiveness of the generated adversarial samples.

Key words: black-box attack, perturbation localization, decision model, word importance evaluation, adversarial sample generation
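
The pipeline summarized in the abstract (score word importance with several models, locate the perturbation points, then try replacement candidates) can be illustrated with a minimal Python sketch. It is not the MDLM implementation: the abstract does not specify the three evaluation functions or the decision models, so the sketch substitutes a common leave-one-out importance score averaged over toy keyword-based scorers. All names (`deletion_importance`, `locate_perturbation_points`, `generate_adversarial`, `toy_model_a`, `toy_model_b`, `SUBSTITUTES`) and the tiny substitution table are hypothetical and exist only for illustration.

```python
from typing import Callable, Dict, List, Optional, Sequence

# Assumed interface: a black-box model maps a token list to the confidence
# it assigns to the originally predicted label.
Model = Callable[[Sequence[str]], float]


def deletion_importance(tokens: Sequence[str], model: Model) -> List[float]:
    """Leave-one-out score: confidence drop when token i is removed."""
    base = model(tokens)
    return [
        base - model(list(tokens[:i]) + list(tokens[i + 1:]))
        for i in range(len(tokens))
    ]


def locate_perturbation_points(
    tokens: Sequence[str], models: Sequence[Model], top_k: int = 1
) -> List[int]:
    """Average the scores of several models and return the top_k positions."""
    per_model = [deletion_importance(tokens, m) for m in models]
    avg = [sum(col) / len(models) for col in zip(*per_model)]
    order = sorted(range(len(tokens)), key=lambda i: avg[i], reverse=True)
    return order[:top_k]


def generate_adversarial(
    tokens: Sequence[str],
    locator_models: Sequence[Model],
    target: Model,
    substitutes: Dict[str, List[str]],
    threshold: float = 0.5,
    top_k: int = 2,
) -> Optional[List[str]]:
    """Try substitutes at the located positions until the target model's
    confidence falls below the threshold; return None if nothing works."""
    candidate = list(tokens)
    for pos in locate_perturbation_points(tokens, locator_models, top_k):
        for sub in substitutes.get(candidate[pos], []):
            trial = candidate[:pos] + [sub] + candidate[pos + 1:]
            if target(trial) < threshold:
                return trial
    return None


# Hypothetical table: a traditional-character form and a Pinyin form per word.
SUBSTITUTES: Dict[str, List[str]] = {
    "难吃": ["難吃", "nanchi"],
    "差劲": ["差勁", "chajin"],
}


def toy_model_a(tokens: Sequence[str]) -> float:
    """Toy scorer that reacts only to the exact simplified form of one word."""
    return 0.9 if "难吃" in tokens else 0.2


def toy_model_b(tokens: Sequence[str]) -> float:
    """Toy scorer that also gives a little weight to the intensifier."""
    score = 0.5
    if "难吃" in tokens:
        score += 0.3
    if "很" in tokens:
        score += 0.1
    return score


if __name__ == "__main__":
    sentence = ["这家", "店", "很", "难吃"]
    print(locate_perturbation_points(sentence, [toy_model_a, toy_model_b], top_k=2))
    print(generate_adversarial(sentence, [toy_model_a, toy_model_b],
                               toy_model_a, SUBSTITUTES))
```

Averaging the scores of several scorers is only one simple way to combine them; a real system would replace the toy scorers with trained classifiers and build the substitution candidates from a proper traditional-character, Pinyin, and homophone resource.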