
Computer Engineering ›› 2025, Vol. 51 ›› Issue (12): 202-209. doi: 10.19678/j.issn.1000-3428.0068685

• Cyberspace Security •

Research on Truth Discovery Method for Privacy Protection of Long-Tail Data

WEN Aijun1, LIU Zesan1, WANG Zhenya1,2,*, FU Chenghua1

  1. Information and Telecommunication Research Institute, State Grid Information and Telecommunication Group Co., Ltd., Beijing 100052, China
  2. State Key Laboratory of Networking and Switching Technology, Beijing University of Posts and Telecommunications, Beijing 100876, China
  • Received: 2023-10-24 Revised: 2024-02-27 Online: 2025-12-15 Published: 2024-08-19
  • Contact: WANG Zhenya
  • Funding: National Key Research and Development Program of China (2022YFB2403900)

Abstract:

Crowdsensing makes it convenient to collect sensing data from a large number of sensing devices. However, owing to limited device accuracy, background noise, and similar issues, the sensing results reported by participants in crowdsensing tasks are often unreliable. Truth discovery addresses this by evaluating the credibility of each participant and extracting more accurate and reliable results from the unreliable data. Existing truth discovery methods, however, either fail to give participants sufficient data privacy protection, leaving a risk of privacy leakage, or ignore the long-tail effect that may arise in practical scenarios, leaving room to improve data utility. To address these issues, this paper proposes a truth discovery method for the privacy protection of long-tail data. The method requires participants to perform some computation only during the initialization phase and does not require them to remain online during protocol execution. It uses Yao's garbled circuits as the privacy protection mechanism: the computing servers carry out the entire truth discovery process over ciphertext, without exposing any input data or intermediate results. In addition, the basic operations of the truth discovery process are precomputed offline, which further improves online execution efficiency. Theoretical and experimental analyses show that the method provides sufficient privacy protection for each participant while preserving data utility.

Key words: privacy computing, crowdsensing, truth discovery, long-tail data, garbled circuit
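
For readers unfamiliar with truth discovery, the credibility-weighted estimation that the abstract refers to can be illustrated in plaintext. The sketch below is not the paper's protocol: it uses no garbled circuits and no long-tail handling, and it assumes a common log-ratio weight update from the general truth discovery literature. The function name `truth_discovery`, its parameters, and the sample data are hypothetical.

```python
import math


def truth_discovery(observations, iterations=10, eps=1e-9):
    """Plaintext iterative truth discovery (illustrative sketch only).

    observations[k][m] is the value participant k reports for object m.
    Returns (truths, weights). Assumes at least two participants and
    that every participant reports a value for every object.
    """
    num_participants = len(observations)
    num_objects = len(observations[0])
    weights = [1.0] * num_participants  # start with equal credibility
    truths = [0.0] * num_objects

    for _ in range(iterations):
        # Truth update: credibility-weighted average of the reports.
        total_weight = sum(weights)
        truths = [
            sum(weights[k] * observations[k][m] for k in range(num_participants))
            / total_weight
            for m in range(num_objects)
        ]

        # Weight update (assumed rule): participants whose reports lie
        # farther from the current truths receive a smaller weight.
        distances = [
            sum((observations[k][m] - truths[m]) ** 2 for m in range(num_objects))
            + eps  # avoids division by zero for a perfect match
            for k in range(num_participants)
        ]
        total_distance = sum(distances)
        weights = [math.log(total_distance / d) for d in distances]

    return truths, weights


if __name__ == "__main__":
    # Two fairly consistent participants and one noisy participant.
    sample = [
        [10.0, 20.0, 30.0],
        [10.2, 19.8, 30.1],
        [14.0, 25.0, 22.0],
    ]
    estimated_truths, credibility = truth_discovery(sample)
    print("estimated truths:", estimated_truths)
    print("participant weights:", credibility)
```

In this sketch the noisy participant ends up with the smallest weight, so the estimates are pulled toward the two consistent participants rather than toward the plain average. The method described in the abstract performs this kind of computation on the computing servers over ciphertext instead of on raw reports.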