Computer Engineering ›› 2024, Vol. 50 ›› Issue (3): 182-190. doi: 10.19678/j.issn.1000-3428.0067954

• Cyberspace Security •

Multi-Technology Fused Data Trading Method Based on Federated Learning

Shaojie LIU1,2, Bin WEN1,2,*, Zexu WANG3

  1. Key Laboratory of Data Science and Smart Education, Ministry of Education, Hainan Normal University, Haikou 571158, Hainan, China
  2. School of Information Science and Technology, Hainan Normal University, Haikou 571158, Hainan, China
  3. School of Software Engineering, Sun Yat-sen University, Zhuhai 519082, Guangdong, China
  • Received: 2023-06-28 Published: 2024-03-15 Online: 2023-11-14
  • Contact: Bin WEN
  • Funding:
    National Natural Science Foundation of China (62362029); Hainan Provincial Natural Science Foundation (623RC485); Hainan Provincial Natural Science Foundation (620RC605); Hainan Provincial Graduate Innovative Research Project (Qhys2021-306)

Abstract:

Data protection constraints confine data within individual enterprises and organizations, forming numerous "data islands" whose value is difficult to realize. Federated Learning (FL) makes data sharing between organizations possible, but unclear benefit distribution schemes, high communication costs, and centralization keep it from meeting the multifaceted demands of data trading scenarios. To address these issues, a Multi-Technology Fused Data Trading (MTFDT) method based on FL is proposed. The incentive mechanism is designed by combining trusted execution environments with the Shapley value. The synchronization of model data during trading is optimized with a model synchronization scheme based on a tree topology, which reduces the synchronization time complexity from linear to logarithmic. In addition, blockchain-based storage schemes for benefit distribution data and model data are designed, so that transaction records are tamper-proof and accountability can be enforced through traceability. Simulations and comparisons are performed on public datasets. The experimental results show that MTFDT evaluates model training effects precisely and improves the fairness of benefit distribution. Compared with existing solutions, model synchronization time is reduced by up to 34%, and the bandwidth requirement is lower.

Key words: data transaction, Federated Learning (FL), blockchain, incentive system, communication optimization
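
The two quantitative ideas named in the abstract, Shapley-value-based benefit distribution and the move from linear to logarithmic synchronization, can be illustrated with a small sketch. This is not the MTFDT implementation: the trusted execution environment and blockchain components are omitted, and the function names (`shapley_values`, `sequential_rounds`, `tree_rounds`), the toy utility function, and the `fanout` parameter are all hypothetical. The round count assumes every node that already holds the model forwards it to its children in parallel within one round.

```python
from itertools import combinations
from math import factorial


def shapley_values(players, utility):
    """Exact Shapley value of each player.

    `utility` maps a frozenset of players to a number (for example, the
    accuracy of a model trained on that coalition's data). Cost is
    exponential in len(players), so this only suits small groups.
    """
    n = len(players)
    values = {}
    for p in players:
        others = [q for q in players if q != p]
        total = 0.0
        for k in range(n):  # k = size of the coalition that p joins
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for coalition in combinations(others, k):
                s = frozenset(coalition)
                # Marginal contribution of p to coalition s.
                total += weight * (utility(s | {p}) - utility(s))
        values[p] = total
    return values


def sequential_rounds(n):
    """Rounds needed when one node sends the model to the others one by one."""
    return max(n - 1, 0)


def tree_rounds(n, fanout=2):
    """Rounds needed when the model is relayed down a tree of n nodes.

    Assumption: in each round, every node on the current frontier sends
    to `fanout` children in parallel, so the round count equals the
    depth of the tree.
    """
    covered, frontier, rounds = 1, 1, 0
    while covered < n:
        frontier *= fanout
        covered += frontier
        rounds += 1
    return rounds


if __name__ == "__main__":
    # Hypothetical utility: the coalition is only useful if A and B both join.
    toy_utility = lambda s: 1.0 if {"A", "B"} <= s else 0.0
    print(shapley_values(["A", "B", "C"], toy_utility))
    for n in (8, 64, 512):
        print(n, sequential_rounds(n), tree_rounds(n))
```

In the toy game, the participant whose data never changes the outcome receives a zero share, which is the fairness property that motivates using the Shapley value for benefit distribution. The round counts grow with n in the sequential case and with the tree depth in the relay case, matching the linear-versus-logarithmic contrast stated in the abstract.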