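Abstract: To prevent the leakage of sensitive information in published data, a clustering-based anonymity protection algorithm is proposed. The often-overlooked influence of quasi-identifiers on sensitive attributes is analyzed, and an improved K-means clustering algorithm is used to cluster the data by sensitive attribute so that records within a class are more similar. Taking into account the diversity of sensitive attributes within each equivalence class, the (K,L)-anonymity algorithm is applied to cluster the table to be published. Experimental results show that, compared with the traditional K-anonymity algorithm, the proposed algorithm achieves privacy protection with less information loss and a shorter execution time.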
Keywords: (K,L)-anonymity, sensitive attribute, privacy protection, information loss, clustering, K-means algorithm
Abstract: To prevent leakage of sensitive information in released data, this paper proposes a clustering-based anonymity protection algorithm. It takes into account the often-overlooked influence of quasi-identifiers on sensitive attributes and applies a modified K-means clustering algorithm to cluster the data by sensitive attribute, making the data within each class more similar. Considering the diversity of sensitive attributes within each equivalence class, it applies a (K,L)-anonymity method to the table to be published. Experimental results show that, compared with traditional K-anonymity methods, the proposed algorithm achieves privacy protection while reducing information loss, preserving higher data accuracy, and shortening execution time.
Key words: (K,L)-anonymous, sensitive attribute, privacy protection, information loss, clustering, K-means algorithm
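To make the (K,L)-anonymity condition in the abstract concrete, the following is a minimal sketch of a checker. It assumes the common reading of the term: every equivalence class (records sharing the same quasi-identifier values) contains at least K records and at least L distinct sensitive values. The function name `satisfies_kl_anonymity`, the column names, and the sample table are hypothetical illustrations, not taken from the paper, and the sketch does not reproduce the paper's clustering algorithm.

```python
from collections import defaultdict


def satisfies_kl_anonymity(records, quasi_identifiers, sensitive, k, l):
    """Check an assumed (K,L)-anonymity condition on a generalized table.

    Each equivalence class must hold at least k records and at least
    l distinct values of the sensitive attribute.
    """
    groups = defaultdict(list)
    for rec in records:
        # Records with identical quasi-identifier values form one class.
        key = tuple(rec[q] for q in quasi_identifiers)
        groups[key].append(rec[sensitive])
    return all(
        len(values) >= k and len(set(values)) >= l
        for values in groups.values()
    )


# Hypothetical, already-generalized table (values invented for illustration).
table = [
    {"age": "20-29", "zip": "123**", "disease": "flu"},
    {"age": "20-29", "zip": "123**", "disease": "asthma"},
    {"age": "30-39", "zip": "456**", "disease": "flu"},
    {"age": "30-39", "zip": "456**", "disease": "flu"},
]

ok_k_only = satisfies_kl_anonymity(table, ["age", "zip"], "disease", k=2, l=1)
ok_k_and_l = satisfies_kl_anonymity(table, ["age", "zip"], "disease", k=2, l=2)
```

In this sample, both classes have two records, so the K = 2 requirement holds, but the second class contains only one distinct disease, so the table fails once L = 2 is required. This is the kind of gap that plain K-anonymity leaves open and that the diversity requirement is meant to close.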
CLC number:
柴瑞敏,冯慧慧. 基于聚类的高效(K,L)-匿名隐私保护[J]. 计算机工程, 2015, 41(1): 139-142.
CHAI Ruimin, FENG Huihui. Efficient (K,L)-anonymous Privacy Protection Based on Clustering[J]. Computer Engineering, 2015, 41(1): 139-142.