
Computer Engineering


Clustering-Based Decentralized Federated Unlearning Method

  • Published: 2025-07-22


Abstract: Data privacy regulations mandate that machine learning models be able to securely remove user data, enabling efficient responses to unlearning requests. In Decentralized Federated Learning (DFL), however, unlearning is challenged by the high retraining cost caused by multi-round parameter propagation, as well as by client heterogeneity in computational power and communication bandwidth and by non-independent and identically distributed (Non-IID) data, all of which significantly impede model convergence. Existing research focuses predominantly on centralized federated learning, with limited exploration of unlearning mechanisms in DFL. To address this gap, this paper proposes a Clustering-based Decentralized Federated Unlearning method (CDFU), which improves unlearning efficiency through client clustering and optimized topology learning. CDFU employs the proposed Hierarchical Clustering algorithm (HCK), which performs coarse- and fine-grained clustering based on computational power, communication bandwidth, and data-distribution features, forming time-synchronized and data-balanced clusters. This confines retraining to the affected clusters, reducing computational and communication overhead. After clustering, CDFU applies a Latency-Optimized Topology Learning algorithm (LOTL) to construct and refine the intra-cluster communication topology, minimizing communication delay and accelerating model convergence. Experimental results demonstrate that CDFU significantly improves training efficiency across multiple datasets, achieving an average test-accuracy gain of 10.2% over existing methods.
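The coarse-to-fine clustering idea behind HCK can be illustrated with a minimal sketch: clients are first grouped by compute and bandwidth (so cluster members stay time-synchronized), then each coarse group is split by data-distribution similarity (to limit Non-IID skew). The feature names, binning scheme, and distance threshold below are illustrative assumptions, not the paper's actual algorithm.

```python
# Hypothetical sketch of coarse-to-fine client clustering in the spirit of HCK.
# The two-stage grouping, thresholds, and feature encoding are assumptions.
from dataclasses import dataclass

@dataclass
class Client:
    cid: int
    compute: float       # normalized computational power in [0, 1] (assumed)
    bandwidth: float     # normalized communication bandwidth in [0, 1] (assumed)
    label_hist: list     # per-class sample fractions (data distribution)

def coarse_cluster(clients, n_bins=2):
    """Coarse stage: bucket clients with similar compute/bandwidth together,
    so each group can train in rough time synchrony."""
    groups = {}
    for c in clients:
        key = (int(c.compute * n_bins * 0.999), int(c.bandwidth * n_bins * 0.999))
        groups.setdefault(key, []).append(c)
    return list(groups.values())

def l1_dist(p, q):
    """L1 distance between two label histograms."""
    return sum(abs(a - b) for a, b in zip(p, q))

def fine_split(group, max_dist=0.5):
    """Fine stage: split a coarse group so members of each sub-cluster
    have similar data distributions (limiting Non-IID imbalance)."""
    subs = []
    for c in group:
        for s in subs:
            if l1_dist(c.label_hist, s[0].label_hist) <= max_dist:
                s.append(c)
                break
        else:
            subs.append([c])
    return subs

# Toy example: two fast clients with similar data, two slow clients with
# very different data distributions.
clients = [
    Client(0, 0.90, 0.80, [0.5, 0.5]),
    Client(1, 0.85, 0.90, [0.45, 0.55]),
    Client(2, 0.20, 0.10, [0.9, 0.1]),
    Client(3, 0.15, 0.20, [0.1, 0.9]),
]
clusters = [sub for g in coarse_cluster(clients) for sub in fine_split(g)]
print([[c.cid for c in s] for s in clusters])  # fast pair stays together; slow, skewed clients split
```

Under this sketch, an unlearning request from client 2 would trigger retraining only within client 2's cluster, which is the mechanism by which CDFU confines retraining cost to the affected cluster.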
