
Computer Engineering, 2023, Vol. 49, Issue (4): 23-31. doi: 10.19678/j.issn.1000-3428.0064391

• Hot Topics and Reviews •

Collaborative Computing of Privacy-Preserving Logistic Regression Based on Homomorphic Encryption

YANG Yuejia1, HUA Bei2, ZHONG Zhiwei2, GAO Mi2

  1. School of Cyberspace Security, University of Science and Technology of China, Hefei 230031, China;
  2. School of Computer Science and Technology, University of Science and Technology of China, Hefei 230031, China
  • Received: 2022-04-07 Revised: 2022-06-08 Published: 2022-08-22
  • About the authors: YANG Yuejia (born 1999), female, master's student, whose main research interest is secure multiparty computation; HUA Bei, professor, Ph.D.; ZHONG Zhiwei and GAO Mi, master's students.
  • Funding:
    Science and Technology Innovation 2030 Major Project on "New Generation Artificial Intelligence" (2018AAA0101200).

Abstract: With the establishment and standardization of data exchange markets, a new demand has emerged for multiple parties to collaboratively train a machine learning model. Federated learning enables multiple data owners to jointly train a model and suits scenarios in which the model is both built and used by all participants. However, existing federated learning frameworks cannot be applied to scenarios in which the data owners and the model demander have different requirements and the model is jointly trained but not shared. A collaborative computing scheme for privacy-preserving logistic regression based on homomorphic encryption is proposed, which does not depend on any third-party computing platform. The scheme includes a multiparty collaborative computing framework comprising data owners, a model demander, and a key generator, together with an interactive multiparty collaborative computing process built on that framework. With this scheme, the parties collaboratively complete model training without leaking model information or any party's private data. The security of the scheme is analyzed by establishing an attack model. A prototype system is implemented on a small computer cluster using Cheon-Kim-Kim-Song (CKKS), an advanced fully homomorphic encryption scheme for floating-point numbers. The prototype is optimized for both computation and communication, including early termination of training and offloading ciphertext homomorphic operations to the Graphics Processing Unit (GPU) to improve computational efficiency. The experimental results show that the computational optimizations yield a speedup of approximately 50 times, and that the prototype system can satisfy practical requirements on small and medium-sized data sets.

Key words: data sharing, collaborative computing, privacy-preserving computing, homomorphic encryption, logistic regression

CLC number: