
Computer Engineering


Clustering-Based Decentralized Federated Unlearning Method

  • Published: 2025-07-22


Abstract: Data privacy regulations mandate that machine learning models be able to securely remove user data, enabling efficient responses to unlearning requests. In Decentralized Federated Learning (DFL), however, unlearning is challenged by the high retraining cost caused by multi-round parameter propagation, as well as by client heterogeneity in computational power and communication bandwidth and by non-independent and identically distributed (Non-IID) data, all of which significantly impede model convergence. Existing research focuses predominantly on centralized federated learning, with limited exploration of unlearning mechanisms in DFL. To address this gap, this paper proposes a Clustering-based Decentralized Federated Unlearning method (CDFU), which improves unlearning efficiency through client clustering and optimized topology learning. CDFU employs the proposed Hierarchical Clustering algorithm (HCK), which performs coarse- and fine-grained clustering based on computational power, communication bandwidth, and data-distribution features, forming time-synchronized and data-balanced clusters. This confines retraining to the affected clusters, reducing computational and communication overhead. After clustering, CDFU applies a Latency-Optimized Topology Learning algorithm (LOTL) to construct and refine the intra-cluster communication topology, minimizing communication delay and accelerating model convergence. Experimental results demonstrate that CDFU significantly improves training efficiency across multiple datasets, achieving an average test-accuracy gain of 10.2% over existing methods.
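The coarse-to-fine clustering idea behind HCK can be illustrated with a minimal sketch: clients are first grouped by compute and bandwidth (so cluster members stay time-synchronized), then each coarse group is split by data-distribution similarity (to limit Non-IID skew). The feature names, binning scheme, and distance threshold below are illustrative assumptions, not the paper's actual algorithm.

```python
# Hypothetical sketch of coarse-to-fine client clustering in the spirit of HCK.
# The two-stage grouping, thresholds, and feature encoding are assumptions.
from dataclasses import dataclass

@dataclass
class Client:
    cid: int
    compute: float       # normalized computational power in [0, 1] (assumed)
    bandwidth: float     # normalized communication bandwidth in [0, 1] (assumed)
    label_hist: list     # per-class sample fractions (data distribution)

def coarse_cluster(clients, n_bins=2):
    """Coarse stage: bucket clients with similar compute/bandwidth together,
    so each group can train in rough time synchrony."""
    groups = {}
    for c in clients:
        key = (int(c.compute * n_bins * 0.999), int(c.bandwidth * n_bins * 0.999))
        groups.setdefault(key, []).append(c)
    return list(groups.values())

def l1_dist(p, q):
    """L1 distance between two label histograms."""
    return sum(abs(a - b) for a, b in zip(p, q))

def fine_split(group, max_dist=0.5):
    """Fine stage: split a coarse group so members of each sub-cluster
    have similar data distributions (limiting Non-IID imbalance)."""
    subs = []
    for c in group:
        for s in subs:
            if l1_dist(c.label_hist, s[0].label_hist) <= max_dist:
                s.append(c)
                break
        else:
            subs.append([c])
    return subs

# Toy example: two fast clients with similar data, two slow clients with
# very different data distributions.
clients = [
    Client(0, 0.90, 0.80, [0.5, 0.5]),
    Client(1, 0.85, 0.90, [0.45, 0.55]),
    Client(2, 0.20, 0.10, [0.9, 0.1]),
    Client(3, 0.15, 0.20, [0.1, 0.9]),
]
clusters = [sub for g in coarse_cluster(clients) for sub in fine_split(g)]
print([[c.cid for c in s] for s in clusters])  # fast pair stays together; slow, skewed clients split
```

Under this sketch, an unlearning request from client 2 would trigger retraining only within client 2's cluster, which is the mechanism by which CDFU confines retraining cost to the affected cluster.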
