
Computer Engineering ›› 2023, Vol. 49 ›› Issue (1): 22-30. doi: 10.19678/j.issn.1000-3428.0064301

• Hot Topics and Reviews •

Cluster Federated Optimization Model Based on Hyperledger Fabric

LI Youhuizi1, YU Haitao1, YIN Yuyu1, GAO Honghao2,3

  1. School of Computer Science, Hangzhou Dianzi University, Hangzhou 310018, China;
    2. School of Computer Engineering and Science, Shanghai University, Shanghai 200444, China;
    3. Department of Computer Engineering, Gachon University, Seongnam 461701, Republic of Korea
  • Received: 2022-03-25  Revised: 2022-10-14  Published: 2022-11-16
  • About the authors: LI Youhuizi (1989-), female, associate professor, Ph.D.; her main research interests include edge computing and privacy protection. YU Haitao is a master's degree candidate. YIN Yuyu and GAO Honghao are professors holding Ph.D. degrees.
  • Funding: Zhejiang Province "Pioneer" and "Leading Goose" R&D Program, "Research and Application Demonstration of Intelligent Diagnosis and Treatment Technology for Post-Stroke Cognitive Impairment Based on Heterogeneous Multimodal Data Fusion" (2022C03043); Zhejiang Provincial Natural Science Foundation, "Research on Key Technologies of Full-Path Trajectory Privacy Protection in the Internet of Vehicles" (LY22F020018).


Abstract: As a distributed machine learning framework, Federated Learning (FL) achieves collaborative training by sharing model parameters while keeping data local, which addresses privacy protection to a certain extent. However, FL still faces several challenges: the central parameter server is a single point of failure, potentially malicious clients can launch gradient attacks, and skewed client data distributions degrade training performance. To address these problems, decentralized blockchain technology is combined with FL and a cluster federated optimization model based on Hyperledger Fabric is proposed. The model uses Hyperledger Fabric as the architectural basis for distributed training. After initialization, each client trains locally and transmits its model parameters and data-distribution information to the Hyperledger Fabric ledger, and clustering is used to improve the training performance of FL when client data are Non-IID. On this basis, a client is randomly elected as the leader, which takes over the role of the central server: the leader clusters clients according to distribution similarity and cosine similarity, downloads the corresponding model parameters, and aggregates them per cluster. Finally, each client obtains its cluster's aggregated model and continues iterative training. On the EMNIST dataset with Non-IID data, the average accuracy of the proposed model is 79.26%, which is 17.26% higher than that of FedAvg; while maintaining accuracy, the number of communication rounds required to reach convergence is 36.3% lower than that of Cluster Federated Learning (CFL).
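To make the leader's role concrete, the following Python/NumPy sketch illustrates only the clustering-and-aggregation step described in the abstract: clients are grouped when both their label-distribution similarity and the cosine similarity of their parameter updates exceed thresholds, and a FedAvg-style weighted average is computed inside each cluster. The similarity measures, the thresholds tau_w and tau_d, the greedy grouping rule, and all names are illustrative assumptions rather than the paper's exact algorithm; the Hyperledger Fabric layer (smart contracts for uploading and downloading parameters, leader election) is omitted.

import numpy as np

def cosine_sim(a, b):
    # Cosine similarity between two flattened parameter-update vectors.
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12))

def distribution_sim(p, q):
    # Similarity of two label-distribution vectors (1 - total variation distance).
    return 1.0 - 0.5 * float(np.abs(p - q).sum())

def cluster_clients(updates, dists, tau_w=0.8, tau_d=0.7):
    # Greedily group clients whose updates AND label distributions are both similar
    # to the first member of an existing cluster; otherwise open a new cluster.
    clusters = []  # each cluster is a list of client indices
    for i in range(len(updates)):
        for c in clusters:
            rep = c[0]
            if (cosine_sim(updates[i], updates[rep]) >= tau_w
                    and distribution_sim(dists[i], dists[rep]) >= tau_d):
                c.append(i)
                break
        else:
            clusters.append([i])
    return clusters

def aggregate(updates, samples, cluster):
    # FedAvg-style aggregation: average the parameters inside one cluster,
    # weighted by each client's local sample count.
    weights = np.array([samples[i] for i in cluster], dtype=float)
    weights /= weights.sum()
    return sum(w * updates[i] for w, i in zip(weights, cluster))

# Toy example: 4 clients, two pairs with similar updates and skewed label distributions.
rng = np.random.default_rng(0)
base_a, base_b = rng.normal(size=10), rng.normal(size=10)
updates = [base_a + 0.1 * rng.normal(size=10), base_a + 0.1 * rng.normal(size=10),
           base_b + 0.1 * rng.normal(size=10), base_b + 0.1 * rng.normal(size=10)]
dists = [np.array([0.9, 0.1]), np.array([0.85, 0.15]),
         np.array([0.1, 0.9]), np.array([0.15, 0.85])]
samples = [120, 80, 150, 100]
for cluster in cluster_clients(updates, dists):
    print(cluster, np.round(aggregate(updates, samples, cluster)[:3], 3))

In a full deployment, the uploads and downloads in this sketch would be replaced by smart-contract transactions against the Hyperledger Fabric ledger, as the abstract describes.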

Key words: blockchain, Hyperledger Fabric, Federated Learning (FL), privacy protection, smart contract

CLC number: