
Computer Engineering ›› 2025, Vol. 51 ›› Issue (2): 179-187. doi: 10.19678/j.issn.1000-3428.0068705

• Cyberspace Security •

Privacy Preserving Algorithm Using Federated Learning Against Attacks

WU Ruolan, CHEN Yuling*, DOU Hui, ZHANG Yangwen, LONG Zhong

  1. College of Computer Science and Technology, State Key Laboratory of Public Big Data (co-built by the province and the ministry), Guizhou University, Guiyang 550025, Guizhou, China
  • Received: 2023-10-30 Online: 2025-02-15 Published: 2025-02-28
  • Contact: CHEN Yuling
  • Supported by: National Natural Science Foundation of China (62202118); Guizhou Provincial Department of Education "Open Competition" Science and Technology Research Project (Qianjiaoji [2023] No. 003); Guizhou Provincial Science and Technology Department Hundred-Level Innovative Talents Project (Qiankehe Platform Talents-GCC[2023]018); Guizhou Provincial Department of Education Natural Science Research Top Talents Project (Qianjiaoji [2022] No. 073)

Abstract:

Federated learning is an emerging distributed learning framework that allows multiple clients to jointly train a global model without sharing their raw data, thereby effectively safeguarding data privacy. However, traditional federated learning still harbors latent security vulnerabilities and is susceptible to poisoning and inference attacks. Therefore, to improve the security and model performance of federated learning, malicious client behavior must be identified accurately, and gradient noising must be employed to prevent attackers from recovering client data by monitoring gradient information. This study proposes a robust federated learning framework that combines a malicious-client detection mechanism with Local Differential Privacy (LDP) techniques. The algorithm first uses gradient similarity to identify potentially malicious clients, minimizing their adverse impact on the model training task. Subsequently, an LDP algorithm based on a dynamic privacy budget is designed, which accommodates the sensitivity of different queries and individual privacy requirements with the objective of balancing privacy preservation against data quality. Experimental results on the MNIST, CIFAR-10, and Movie Reviews (MR) text classification datasets demonstrate that, compared with three baseline algorithms, the proposed algorithm improves accuracy for sP-type clients by an average of 3 percentage points, achieving a higher level of security in federated learning and significantly enhancing model performance.
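
To make the two mechanisms described above concrete, the following is a minimal Python sketch, not the authors' implementation: it flags clients whose gradients diverge from the coordinate-wise median update using cosine similarity, then perturbs the surviving gradients with Laplace noise under a per-client dynamic privacy budget. The median reference, the similarity threshold, the budget rule in dynamic_epsilon, and all function names are illustrative assumptions; the paper's concrete similarity measure and budget allocation may differ.

```python
# Illustrative sketch only (assumed names and rules, not the paper's code).
import numpy as np

def cos_sim(a, b):
    """Cosine similarity between two flattened gradient vectors."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12))

def detect_malicious(grads, threshold=0.3):
    """Flag clients whose gradient is dissimilar to the coordinate-wise
    median update (a stand-in for the paper's gradient-similarity test)."""
    ref = np.median(np.stack(grads), axis=0)
    return {i for i, g in enumerate(grads) if cos_sim(g, ref) < threshold}

def dynamic_epsilon(base_eps, sensitivity, privacy_pref):
    """Toy dynamic-budget rule: shrink the budget (hence add more noise)
    for more sensitive queries and stronger individual privacy demands."""
    return base_eps * privacy_pref / max(sensitivity, 1e-12)

def ldp_perturb(grad, epsilon, clip=1.0):
    """Clip the update to L1 norm <= clip, bounding the L1 sensitivity
    between any two clipped updates by 2 * clip, then add Laplace noise
    with scale sensitivity / epsilon."""
    g = grad * min(1.0, clip / (np.abs(grad).sum() + 1e-12))
    return g + np.random.laplace(0.0, 2.0 * clip / epsilon, size=grad.shape)

def aggregate_round(grads, sensitivities, prefs, base_eps=1.0):
    """One server round: drop flagged clients, noise the rest under each
    client's dynamic budget, then average the survivors."""
    bad = detect_malicious(grads)
    noisy = [ldp_perturb(g, dynamic_epsilon(base_eps, s, p))
             for i, (g, s, p) in enumerate(zip(grads, sensitivities, prefs))
             if i not in bad]
    return np.mean(noisy, axis=0)
```

Clipping before noising is what makes the Laplace scale well defined: it bounds each update's sensitivity, so the noise magnitude depends only on the clipping bound and the client's allotted budget, not on the raw gradient.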

Key words: federated learning, poisoning attack, inference attack, Local Differential Privacy (LDP), privacy protection