Federated Learning Architecture for Non-IID Data

doi:10.19678/j.issn.1000-3428.0064016

Abstract

In federated learning scenarios involving an ultra-large number of edge devices, participants' local data are non-independent and identically distributed (non-IID), which leaves the overall training data imbalanced and makes poisoning attacks difficult to defend against. Most methods for improving data balance in supervised learning require prior knowledge that conflicts with the privacy-protection principle of federated learning, and existing defenses against poisoning attacks in non-IID scenarios are either overly complex or violate data privacy. This paper proposes FedFog, a multi-server architecture that clusters participants with similar data distributions without disclosing their local data distributions, thereby converting the non-IID training data into multiple IID data subsets. Based on the cluster centers, the global server calculates the weight that the features extracted from each class of data receive in the global model update, which alleviates the negative impact of the overall training data imbalance. In addition, FedFog moves the poisoning-attack defense task from the full set of participants into each individual cluster, which addresses the poisoning-attack defense problem. Experimental results show that, when the overall training data are imbalanced, FedFog improves global model accuracy by up to 4.2 percentage points over FedSGD; when the overall data are balanced but 1/3 of the participants are poisoning attackers, the convergence of FedFog is close to that of FedSGD in the scenario without poisoning attacks.

Keywords: non-independent and identically distributed (non-IID), privacy protection, clustering, data balance, poisoning attack defense
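The clustering idea in the abstract can be illustrated with a minimal sketch. It assumes each participant is summarized by a normalized label histogram and is assigned to the nearest cluster center by Euclidean distance. The function names, the toy data, and the plaintext comparison are all assumptions made for illustration: FedFog performs clustering without disclosing local distributions, and this sketch leaves out that privacy-preserving part entirely.

```python
import math

def label_distribution(labels, num_classes):
    """Return the normalized label histogram of one participant's local data."""
    counts = [0] * num_classes
    for y in labels:
        counts[y] += 1
    total = len(labels)
    return [c / total for c in counts]

def nearest_center(dist_vec, centers):
    """Index of the cluster center closest (Euclidean) to dist_vec."""
    return min(range(len(centers)), key=lambda j: math.dist(dist_vec, centers[j]))

# Hypothetical toy setup: 3 classes and one cluster center per class.
# NOTE: distributions are compared in plaintext here, which the real
# architecture avoids; this only illustrates the assignment step.
centers = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
participants = {
    "c0": [0, 0, 0, 1],
    "c1": [2, 2, 1, 2],
    "c2": [1, 1, 1, 0],
}
assignment = {
    name: nearest_center(label_distribution(labels, 3), centers)
    for name, labels in participants.items()
}
print(assignment)
```

Participants whose local data are dominated by the same class end up in the same cluster, so the data inside each cluster are closer to IID than the data across all participants.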

Tianchen QIU, Xiaoying ZHENG, Yongxin ZHU, Songlin FENG. Federated Learning Architecture for Non-IID Data[J]. Computer Engineering, 2023, 49(7): 110-117.

https://www.ecice06.com/CN/Y2023/V49/I7/110

Figures/Tables: 10

References: 32

Parameter	Description
$ C $	Number of label classes in the dataset
$ c_i $	Participant
$ |c| $	Number of participants
$ |B| $	Batch size for local training
$ N_i $	Local dataset of $ c_i $
$ |N_i| $	Size of $ N_i $
$ \boldsymbol{d}_i $	Local data distribution vector of $ c_i $
$ s_j $	Sub-server
$ |s| $	Number of sub-servers, $ |s| = C $
$ f $	Mapping from a participant $ c $ to the sub-server $ s $ it belongs to
$ \boldsymbol{g}_i^t $	Gradient information sent by $ c_i $ to $ f(c_i) $ in round $ t $
$ \boldsymbol{m}_j $	Cluster-center vector of $ s_j $
$ q_j $	Set of public keys of the participants managed by $ s_j $
$ |q_j| $	Number of participants managed by $ s_j $
$ v_j^k $	The $ k $-th participant managed by $ s_j $
$ G $	Global server
$ K_{\mathrm{pub}}(\cdot) $	Public-key encryption and signature verification
$ K_{\mathrm{sec}}(\cdot) $	Private-key decryption and signing
$ T $	Number of global training rounds
$ \boldsymbol{G}_j^t $	Gradient information sent by $ s_j $ to $ G $ in round $ t $
$ \alpha $	Learning rate
$ w_{\mathrm{global}}^t $	Global model parameters in round $ t $
$ h_j $	Weight of $ \boldsymbol{G}_j^t $ when computing the global gradient
$ r $	Number of neighboring vectors used when computing a Krum score
$ k $	Number of lowest-scoring vectors selected by the Krum algorithm
$ (\cdot \mid \cdot) $	Packing of multiple data items into one message
$ K_{s;\mathrm{pub}}^{\mathrm{H}} $	Public key of $ s $ for homomorphic encryption
$ K_{s;\mathrm{sec}}^{\mathrm{H}} $	Private key of $ s $ for homomorphic encryption
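The two Krum parameters listed in the table, r and k, can be illustrated with a rough sketch. It assumes a multi-Krum style rule: each gradient is scored by the sum of squared Euclidean distances to its r nearest neighbors, and the k gradients with the lowest scores are kept. The function name `krum_select` and the toy gradients are hypothetical, and how FedFog applies Krum inside each cluster is not specified here.

```python
def krum_select(gradients, r, k):
    """Return indices of the k gradients with the lowest Krum scores.

    The score of a gradient is the sum of squared Euclidean distances
    to its r nearest neighbours among the other gradients.
    """
    n = len(gradients)
    if not (0 < r < n and 0 < k <= n):
        raise ValueError("need 0 < r < n and 0 < k <= n")
    scores = []
    for i in range(n):
        sq_dists = sorted(
            sum((a - b) ** 2 for a, b in zip(gradients[i], gradients[j]))
            for j in range(n) if j != i
        )
        scores.append(sum(sq_dists[:r]))
    # Lower score means the gradient sits in a denser neighbourhood.
    return sorted(range(n), key=lambda i: scores[i])[:k]

# Toy example (assumed data): four similar gradients and one outlier at index 4.
grads = [[1.0, 1.0], [1.1, 0.9], [0.9, 1.1], [1.0, 1.2], [-5.0, 8.0]]
selected = krum_select(grads, r=2, k=3)
print(selected)
```

Because the outlier is far from every other gradient, its score is the largest and it is never among the selected indices. Running the rule per cluster rather than over all participants is attractive because gradients within one cluster come from similar data distributions, so honest updates are expected to lie close together.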

Input	Output	Activation function
28×28-pixel image	Vector of length 300	ReLU
Vector of length 300	Vector of length 100	ReLU
Vector of length 100	Vector of length 10	None
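The layer sizes in the table can be read as a three-layer network. The sketch below assumes fully connected layers with bias terms, a flattened 784-value input, and a small Gaussian initialization. None of these details is stated in the table, and the helper names are hypothetical.

```python
import random

LAYER_SIZES = [28 * 28, 300, 100, 10]  # input and output sizes from the table

def init_params(sizes, seed=0):
    """Randomly initialise (weights, biases) for each fully connected layer."""
    rng = random.Random(seed)
    params = []
    for n_in, n_out in zip(sizes[:-1], sizes[1:]):
        weights = [[rng.gauss(0.0, 0.05) for _ in range(n_in)] for _ in range(n_out)]
        biases = [0.0] * n_out
        params.append((weights, biases))
    return params

def forward(params, image):
    """Flatten a 28x28 image and apply each layer; ReLU on all but the last."""
    x = [pixel for row in image for pixel in row]
    for idx, (weights, biases) in enumerate(params):
        x = [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, biases)]
        if idx < len(params) - 1:
            x = [max(0.0, v) for v in x]
    return x

params = init_params(LAYER_SIZES)
image = [[0.5] * 28 for _ in range(28)]  # hypothetical dummy input
logits = forward(params, image)
num_params = sum(len(w) * len(w[0]) + len(b) for w, b in params)
print(len(logits), num_params)
```

Under these assumptions the model has 784×300 + 300 + 300×100 + 100 + 100×10 + 10 = 266,610 trainable parameters, which is also the length of each gradient vector a participant would send per round.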

Number of minority classes, total number of classes	Total training images	Number of participants
2, 10	49 200	82
8, 10	16 800	28
9, 10	10 400	19
0, 10 (control)	60 000	100
