
Computer Engineering, 2021, Vol. 47, Issue (6): 132-141. doi: 10.19678/j.issn.1000-3428.0058239

• Cyberspace Security •

Comparative Research on Application of Adversarial Samples for End-to-End Speaker Identification

LIAO Junfan1, GU Yijun1, ZHANG Peijing2, LIAO Qian1

  1. College of Information Network Security, People's Public Security University of China, Beijing 102600, China;
  2. Network Information Center, People's Public Security University of China, Beijing 100038, China
  • Received: 2020-05-03 Revised: 2020-06-17 Published: 2020-06-28
  • About the authors: LIAO Junfan (born 1995), male, master's student, main research interest: adversarial sample attack and defense; GU Yijun (corresponding author), professor, Ph.D.; ZHANG Peijing, associate research fellow, M.S.; LIAO Qian, master's student.
  • Funding: Competitive Selection Project of the Technology Research Program of the Ministry of Public Security (2019JZX009); Special Project for Public Safety Behavioral Science Research and Technological Innovation of People's Public Security University of China.
  • Contact e-mail: 754605668@qq.com

Abstract: To explore the security threat that adversarial samples pose to end-to-end speaker identification systems and how effective such attacks are, this paper compares the strengths and weaknesses of existing adversarial sample generation algorithms in the speech domain. It analyzes five white-box algorithms (FGSM, JSMA, BIM, C&W, PGD) and two black-box algorithms (ZOO, HSJA). All seven algorithms are used to mount targeted and non-targeted attacks on end-to-end speaker identification models built on two network architectures, ResCNN and GRU, and to produce audio adversarial samples. The attacks are evaluated with performance indicators such as Attack Success Rate (ASR) and Signal to Noise Ratio (SNR), and a manual (human) concealment test is also performed. Experimental results show that existing adversarial sample generation algorithms can be implemented against end-to-end speaker identification models. Among the white-box algorithms, BIM and PGD perform comparatively well. Non-targeted attacks by the black-box algorithms match the attack effect of the white-box algorithms, but their targeted attack performance still needs improvement.

Key words: speaker identification, adversarial sample, robustness, adversarial attack, Signal to Noise Ratio (SNR)
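
The abstract names Attack Success Rate (ASR) and Signal to Noise Ratio (SNR) as the evaluation indicators but does not give their formulas. The following is a minimal sketch of how such metrics are commonly computed, assuming the conventional definition SNR = 10 log10(P_signal / P_noise) with the adversarial perturbation treated as noise, and ASR as the fraction of samples on which the attack reaches its goal. The function names `snr_db` and `attack_success_rate` are hypothetical and are not taken from the paper.

```python
import math


def snr_db(clean, adversarial):
    """SNR in dB, treating (adversarial - clean) as the noise.

    Assumption: conventional power-ratio definition; the paper's exact
    formula is not stated in the abstract.
    """
    signal_power = sum(x * x for x in clean)
    noise_power = sum((a - x) ** 2 for x, a in zip(clean, adversarial))
    if noise_power == 0.0:
        return math.inf   # no perturbation at all
    if signal_power == 0.0:
        return -math.inf  # silent original, any perturbation dominates
    return 10.0 * math.log10(signal_power / noise_power)


def attack_success_rate(predicted, true_labels, target_labels=None):
    """Fraction of adversarial samples on which the attack succeeds.

    Non-targeted (target_labels is None): success means the predicted
    speaker differs from the true speaker.
    Targeted: success means the predicted speaker equals the target.
    """
    if target_labels is None:
        hits = [p != t for p, t in zip(predicted, true_labels)]
    else:
        hits = [p == g for p, g in zip(predicted, target_labels)]
    return sum(hits) / len(hits)


if __name__ == "__main__":
    clean = [1.0] * 100
    adversarial = [x + 0.1 for x in clean]
    print(snr_db(clean, adversarial))  # about 20 dB
    print(attack_success_rate([1, 2, 0, 3], [0, 2, 1, 3]))
```

Under these assumptions a higher SNR means a smaller perturbation relative to the original audio, so an attack is generally more desirable when it combines a high ASR with a high SNR.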
