
计算机工程 (Computer Engineering) ›› 2025, Vol. 51 ›› Issue (6): 236-244. doi: 10.19678/j.issn.1000-3428.0069131

• Cyberspace Security •


Federated Aggregation Algorithm Based on Natural Nearest Neighbors

SHI Yonghui1, DAI Qi1, CHEN Lifang1,2,*, HAN Yang1,3

  1. College of Science, North China University of Science and Technology, Tangshan 063210, Hebei, China
    2. Hebei Provincial Key Laboratory of Data Science and Application, Tangshan 063210, Hebei, China
    3. Department of Discipline Construction, North China University of Science and Technology, Tangshan 063210, Hebei, China
  • Received: 2023-12-29 Online: 2025-06-15 Published: 2024-06-21
  • Contact: CHEN Lifang
  • Supported by: General Program of the National Natural Science Foundation of China (52074126)


Abstract:

While preserving the privacy of local data, the federated learning framework faces data poisoning attacks from adversaries who contaminate client data, degrading the performance of the global model. Mainstream federated learning frameworks currently assume that client-side local data are clean; in reality, however, attackers can use data poisoning strategies to degrade model accuracy. To address this issue, this study proposes a federated aggregation algorithm based on natural nearest neighbors. Unlike traditional federated defense algorithms, the proposed algorithm is designed for federated learning frameworks under non-independent and identically distributed (non-IID) conditions and can defend against targeted attacks. The algorithm introduces a search process for natural nearest neighbors, through which it assigns anomaly scores to models and effectively distinguishes abnormal ones. It then selects the nodes with the smallest anomaly scores to participate in training, ensuring that normal nodes take part in training far more often than malicious nodes. Experimental results demonstrate that, under non-IID conditions, the algorithm maintains stable model performance in scenarios involving targeted attacks such as label flipping and backdoor attacks, thereby enhancing the robustness of the federated learning framework. Even under malicious attack, the algorithm effectively preserves the performance and reliability of the global model, providing an effective approach to the client data pollution problem and new insights into the security and stability of federated learning frameworks.

Key words: federated learning, aggregation algorithm, natural nearest neighbors, robustness, label flipping
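
As a rough illustration of the mechanism described in the abstract, the sketch below shows how a natural nearest neighbor (NaN) search can turn a round of client updates into anomaly scores that gate aggregation. It is a minimal sketch, not the authors' implementation: flattening client models into vectors, scoring anomaly as 1/(1 + reverse-neighbor count), the keep_ratio cutoff, and plain averaging over the retained clients are all assumptions introduced here. Only the search pattern itself (growing the neighborhood size until reverse-neighbor counts stabilize) follows the standard natural nearest neighbor procedure.

```python
# Minimal sketch of NaN-based robust aggregation. Assumptions (not from
# the paper): updates are flattened parameter vectors, the anomaly score
# is 1 / (1 + reverse-neighbor count), and retained clients are combined
# with a plain FedAvg-style mean.
import numpy as np

def natural_neighbor_scores(updates: np.ndarray) -> np.ndarray:
    """Grow the neighborhood size r until every point has a reverse
    r-nearest neighbor (or the orphan count stops shrinking), then
    score each point by how rarely other points choose it."""
    n = len(updates)
    dists = np.linalg.norm(updates[:, None, :] - updates[None, :, :], axis=2)
    order = np.argsort(dists, axis=1)[:, 1:]       # neighbor lists, self excluded
    reverse_count = np.zeros(n, dtype=int)
    prev_orphans = n
    for r in range(n - 1):
        for i in range(n):
            reverse_count[order[i, r]] += 1        # i adopts its (r+1)-th neighbor
        orphans = int(np.sum(reverse_count == 0))  # points nobody has chosen yet
        if orphans == 0 or orphans == prev_orphans:
            break                                  # natural stable state reached
        prev_orphans = orphans
    return 1.0 / (1.0 + reverse_count)             # fewer reverse neighbors -> more anomalous

def aggregate(updates: np.ndarray, keep_ratio: float = 0.8) -> np.ndarray:
    """Average only the clients with the lowest anomaly scores."""
    scores = natural_neighbor_scores(updates)
    kept = np.argsort(scores)[: max(1, int(keep_ratio * len(updates)))]
    return updates[kept].mean(axis=0)
```

The scoring function and the keep_ratio threshold above are illustrative placeholders for the paper's selection rule; the essential idea from the abstract is that, because anomaly scores are recomputed and low-anomaly nodes are reselected each round, normal nodes end up participating in training far more often than poisoned ones.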