
Computer Engineering ›› 2026, Vol. 52 ›› Issue (4): 103-110. doi: 10.19678/j.issn.1000-3428.0070254

• Computational Intelligence and Pattern Recognition •

Non-IID Federated Learning Method Based on Local Momentum Acceleration

YIN Hengjie1,2, ZHENG Keqing1, KE Jiannan1, DONG Yunquan1   

  1. School of Electronics and Information Engineering, Nanjing University of Information Science and Technology, Nanjing 210044, Jiangsu, China;
    2. Tianchang Research Institute, Nanjing University of Information Science and Technology, Chuzhou 239300, Anhui, China
  • Received: 2024-08-14  Revised: 2024-10-08  Published: 2024-12-11

  • About the authors: YIN Hengjie, male, master's student, main research interest: federated learning, E-mail: 2607307739@qq.com; ZHENG Keqing and KE Jiannan, master's students; DONG Yunquan, professor, Ph.D.
  • Funding:
    General Program of the National Natural Science Foundation of China (62071237).

Abstract: Federated Learning (FL), a distributed machine learning technology, has achieved significant results in privacy protection. However, in practical applications, client drift phenomena occur because of the Non-Independent and Identically Distributed (Non-IID) nature of data sources, leading to slow model convergence and performance degradation. To address this issue, this study proposes a Federated Local Momentum accelerated learning (FedLM) algorithm combined with the attention mechanism. FedLM introduces a global momentum term into local model updates, utilizing the global gradient information from previous rounds to smooth the current update process and correct the divergence of parameter update directions among heterogeneous clients, thereby reducing gradient oscillations and alleviating data heterogeneity issues. The attention mechanism dynamically adjusts the weight of each client in the global model update to improve the quality of the aggregation model. Experimental results show that FedLM achieves significantly better accuracy and stability than existing federated learning algorithms such as SCAFFOLD, FedCM, and Moon in image classification tasks with different levels of data heterogeneity, model structures, and datasets.
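The two mechanisms the abstract describes can be sketched as follows. This is a minimal illustration, not the authors' implementation: the mixing coefficient `alpha`, the distance-based softmax scoring in `attention_aggregate`, and the `temperature` parameter are all assumptions introduced here for clarity; the paper's actual attention scoring may differ.

```python
import numpy as np

def local_update_with_momentum(w_global, delta_global, data_batches,
                               grad_fn, lr=0.01, alpha=0.1):
    """One client's local round: each step blends the client's own
    stochastic gradient with the previous round's global update
    direction (delta_global), damping client drift under Non-IID data.
    alpha is a hypothetical mixing coefficient."""
    w = w_global.copy()
    for batch in data_batches:
        g = grad_fn(w, batch)                        # local gradient
        d = (1 - alpha) * g + alpha * delta_global   # momentum-smoothed direction
        w -= lr * d
    return w

def attention_aggregate(w_global, client_models, temperature=1.0):
    """Server-side aggregation: softmax (attention) weights are derived
    from each client's distance to the current global model, so clients
    whose updates diverge less receive larger weights."""
    dists = np.array([np.linalg.norm(w - w_global) for w in client_models])
    scores = np.exp(-dists / temperature)
    attn = scores / scores.sum()
    w_new = sum(a * w for a, w in zip(attn, client_models))
    return w_new, attn
```

In a full training loop the server would broadcast `w_global` and the last aggregated update `delta_global` to the sampled clients, collect their local models, and call `attention_aggregate` to form the next global model.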

Key words: Federated Learning(FL), data heterogeneity, attention mechanism, client drift, local momentum


