
Computer Engineering ›› 2025, Vol. 51 ›› Issue (8): 131-140. doi: 10.19678/j.issn.1000-3428.0069348

• Artificial Intelligence and Pattern Recognition •

Chinese Adversarial Examples Generation Based on Adaptive Beam Search Algorithm

XIA Niming, ZHANG Jie*

  1. School of Computer Science, Nanjing University of Posts and Telecommunications, Nanjing 210023, Jiangsu, China
  • Received: 2024-02-04 Revised: 2024-03-06 Online: 2025-08-15 Published: 2024-05-17
  • Contact: ZHANG Jie

Chinese Adversarial Examples Generation Based on Adaptive Beam Search Algorithm (Chinese-language listing, translated)

XIA Niming, ZHANG Jie*

  1. School of Computer Science, School of Software, and School of Cyberspace Security, Nanjing University of Posts and Telecommunications, Nanjing 210023, Jiangsu, China
  • Corresponding author: ZHANG Jie
  • Funding:
    Young Scientists Fund of the National Natural Science Foundation of China (61902195); National Key Research and Development Program of China (2018YFB1500902)

Abstract:

Deep Neural Networks (DNNs) are extremely vulnerable to adversarial examples, where subtle perturbations to legitimate inputs can cause a model to yield erroneous outputs. Exploring adversarial attacks can improve the robustness of deep learning models and advance the interpretability of DNNs. Existing methods for generating Chinese adversarial examples typically employ simple transformation strategies, emphasizing isolated Chinese linguistic features without considering the contextual effect of attacks. Hence, a heuristic-based algorithm, BSCA, is proposed in this study. By comprehensively analyzing linguistic variations and incorporating prior knowledge of Chinese character formation, pronunciation, and glyph form, a strategy for accurately assessing deviations between Chinese characters is designed. The adversarial search space is constructed based on this deviation strategy, and an improved beam search algorithm is used to optimize the generation of Chinese adversarial examples in black-box attacks. Under strict constraints on perturbation size and semantic deviation, BSCA can automatically adapt to the requirements of different scenarios. Experimental evaluations conducted on TextCNN, TextRNN, and Bidirectional Encoder Representations from Transformers (BERT) for two Natural Language Processing (NLP) tasks indicate that BSCA reduces classification accuracy by at least 63.84 percentage points while incurring lower attack costs than baseline methods.
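The paper's BSCA operates on Chinese characters using glyph- and pronunciation-based substitutions, which are not detailed in this abstract. As a rough illustration of the beam-search core only, the following toy Python sketch attacks a stand-in black-box scorer using an ASCII substitution table; the names `toy_model`, `SUBSTITUTES`, and `beam_search_attack`, and all parameters, are assumptions for illustration, not the authors' implementation.

```python
# Illustrative sketch of beam search for black-box text attacks.
# In BSCA the substitution candidates would come from Chinese character
# similarity (formation, pronunciation, glyph form); ASCII stand-ins
# keep this sketch self-contained and runnable.

def toy_model(text):
    """Stand-in black-box classifier: confidence that `text` is 'positive'.
    Here, confidence is just the fraction of characters from a fixed set."""
    positive_chars = set("good")
    return sum(c in positive_chars for c in text) / max(len(text), 1)

# Hypothetical visually similar substitutions (assumption, for illustration).
SUBSTITUTES = {"g": ["q"], "o": ["0"], "d": ["b"]}

def beam_search_attack(text, model, beam_width=3, max_edits=2):
    """Keep the `beam_width` perturbed texts that most reduce the model's
    confidence, expanding one character substitution per round."""
    beam = [(model(text), text)]
    for _ in range(max_edits):
        candidates = []
        for _, cur in beam:
            for i, ch in enumerate(cur):
                for sub in SUBSTITUTES.get(ch, []):
                    adv = cur[:i] + sub + cur[i + 1:]
                    candidates.append((model(adv), adv))
        if not candidates:
            break
        # Lower confidence = stronger attack; keep the best beam_width.
        beam = sorted(set(candidates))[:beam_width]
    return beam[0]  # (confidence, adversarial text)

conf, adv = beam_search_attack("good", toy_model)
```

With `max_edits=2`, two substitutions drop the toy confidence from 1.0 to 0.5; the abstract's "strict constraints on perturbation size" would correspond to bounding `max_edits` and filtering candidates by semantic deviation, which this sketch omits.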

Key words: adversarial examples, Chinese feature, black-box attack, beam search, text classification

Abstract (translated from the Chinese):

Deep Neural Networks (DNNs) are highly susceptible to adversarial examples: adding subtle perturbations to the original text is enough to induce the target model into misclassification. Research on adversarial example generation not only helps improve model robustness but also advances work on DNN interpretability. In the Chinese adversarial domain, most existing generation methods adopt a single transformation strategy, consider only a subset of Chinese linguistic features, and ignore the effect an attack has on the surrounding context. To address these problems, a heuristic Chinese adversarial example generation method, BSCA, is proposed. By comprehensively analyzing the differences between phonographic and logographic writing systems, and combining prior knowledge of Chinese character formation, pronunciation, glyph form, and cognitive linguistics, a Chinese text perturbation strategy that accurately evaluates differences between Chinese characters is designed. The perturbation strategy is used to construct the adversarial search space, and an improved beam search algorithm optimizes the black-box attack process. Under strict limits on perturbation size and semantic drift, BSCA can automatically select different attack strategies to suit different scenario requirements. Experiments on the TextCNN, TextRNN, and BERT (Bidirectional Encoder Representations from Transformers) models across multiple Natural Language Processing (NLP) tasks show that BSCA generalizes well, reducing classification accuracy by at least 63.84 percentage points while incurring a lower attack cost than baseline methods.

Key words (translated): adversarial examples, Chinese features, black-box attack, beam search, text classification