
计算机工程 (Computer Engineering) ›› 2025, Vol. 51 ›› Issue (6): 236-244. doi: 10.19678/j.issn.1000-3428.0069131

• Cyberspace Security •


Federated Aggregation Algorithm Based on Natural Nearest Neighbors

SHI Yonghui1, DAI Qi1, CHEN Lifang1,2,*, HAN Yang1,3

  1. College of Science, North China University of Science and Technology, Tangshan 063210, Hebei, China
    2. Hebei Provincial Key Laboratory of Data Science and Application, Tangshan 063210, Hebei, China
    3. Department of Discipline Construction, North China University of Science and Technology, Tangshan 063210, Hebei, China
  • Received: 2023-12-29 Online: 2025-06-15 Published: 2024-06-21
  • Contact: CHEN Lifang
  • Supported by: General Program of the National Natural Science Foundation of China (52074126)


Abstract:

While preserving the privacy of local data, the federated learning framework faces data poisoning attacks from adversaries who contaminate client data, degrading the performance of the global model. Mainstream federated learning frameworks currently assume that client-side local data are clean; in reality, however, attackers can use data poisoning strategies to degrade model accuracy. To address this issue, this study proposes a federated aggregation algorithm based on natural nearest neighbors. Unlike traditional federated defense algorithms, the proposed algorithm is designed for federated learning frameworks under non-independent and identically distributed (non-IID) conditions and can defend against targeted attacks. The algorithm introduces a search process for natural nearest neighbors, through which it assigns anomaly scores to models and effectively distinguishes abnormal ones. It then selects the nodes with the smallest anomaly scores to participate in training, ensuring that normal nodes take part in training far more often than malicious nodes. Experimental results demonstrate that, under non-IID conditions, the algorithm maintains stable model performance in scenarios involving targeted attacks such as label flipping and backdoor attacks, thereby enhancing the robustness of the federated learning framework. Even under malicious attack, the algorithm effectively preserves the performance and reliability of the global model, providing an effective approach to the client data pollution problem and new insights into the security and stability of federated learning frameworks.

Key words: federated learning, aggregation algorithm, natural nearest neighbors, robustness, label flipping
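
As a rough illustration of the mechanism described in the abstract, the sketch below shows how a natural nearest neighbor (NaN) search can turn a round of client updates into anomaly scores that gate aggregation. It is a minimal sketch, not the authors' implementation: flattening client models into vectors, scoring anomaly as 1/(1 + reverse-neighbor count), the keep_ratio cutoff, and plain averaging over the retained clients are all assumptions introduced here. Only the search pattern itself (growing the neighborhood size until reverse-neighbor counts stabilize) follows the standard natural nearest neighbor procedure.

```python
# Minimal sketch of NaN-based robust aggregation. Assumptions (not from
# the paper): updates are flattened parameter vectors, the anomaly score
# is 1 / (1 + reverse-neighbor count), and retained clients are combined
# with a plain FedAvg-style mean.
import numpy as np

def natural_neighbor_scores(updates: np.ndarray) -> np.ndarray:
    """Grow the neighborhood size r until every point has a reverse
    r-nearest neighbor (or the orphan count stops shrinking), then
    score each point by how rarely other points choose it."""
    n = len(updates)
    dists = np.linalg.norm(updates[:, None, :] - updates[None, :, :], axis=2)
    order = np.argsort(dists, axis=1)[:, 1:]       # neighbor lists, self excluded
    reverse_count = np.zeros(n, dtype=int)
    prev_orphans = n
    for r in range(n - 1):
        for i in range(n):
            reverse_count[order[i, r]] += 1        # i adopts its (r+1)-th neighbor
        orphans = int(np.sum(reverse_count == 0))  # points nobody has chosen yet
        if orphans == 0 or orphans == prev_orphans:
            break                                  # natural stable state reached
        prev_orphans = orphans
    return 1.0 / (1.0 + reverse_count)             # fewer reverse neighbors -> more anomalous

def aggregate(updates: np.ndarray, keep_ratio: float = 0.8) -> np.ndarray:
    """Average only the clients with the lowest anomaly scores."""
    scores = natural_neighbor_scores(updates)
    kept = np.argsort(scores)[: max(1, int(keep_ratio * len(updates)))]
    return updates[kept].mean(axis=0)
```

The scoring function and the keep_ratio threshold above are illustrative placeholders for the paper's selection rule; the essential idea from the abstract is that, because anomaly scores are recomputed and low-anomaly nodes are reselected each round, normal nodes end up participating in training far more often than poisoned ones.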