Abstract: To address feature selection in clustering, this paper proposes a data clustering method based on feature semantic weights. The user specifies a required feature set (the Must-Link set); the method then computes the semantic relatedness (semantic relativity) between features and selects the features related to that set as a supplement. The same relatedness values determine a semantic weight for each feature. On this basis, the traditional K-Means clustering algorithm is modified to produce FSW-KMeans, a clustering algorithm that incorporates feature semantic weights. Experimental results show that FSW-KMeans considerably improves clustering accuracy and efficiency.
Key words: ontology; feature semantic weight; semantic relativity; FSW-KMeans algorithm
周川祥, 孟凡荣, 张磊, 王志愿. 具有特征语义权重的数据聚类方法[J]. 计算机工程, 2011, 37(4): 64-66.
ZHOU Chuan-Xiang, MENG Fan-Rong, ZHANG Lei, WANG Zhi-Yuan. Data Clustering Method with Feature Semantic Weight[J]. Computer Engineering, 2011, 37(4): 64-66.
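The abstract describes K-Means modified by per-feature semantic weights but does not give the formulas. The following is a minimal sketch of one common way to do this, assuming the weights scale each feature's contribution to the squared Euclidean distance and are obtained by normalizing relatedness scores so they sum to 1. The function names `weights_from_relatedness` and `weighted_kmeans`, the normalization, and the distance form are illustrative assumptions, not the authors' FSW-KMeans definition.

```python
import numpy as np


def weights_from_relatedness(relatedness):
    """Normalize relatedness scores into weights that sum to 1.

    Assumption: the paper's actual weight formula is not given in the
    abstract; simple normalization is used here for illustration only.
    """
    r = np.asarray(relatedness, dtype=float)
    return r / r.sum()


def weighted_kmeans(X, weights, k, n_iter=100, seed=0):
    """K-Means with a feature-weighted squared Euclidean distance.

    Assumed distance: d(x, c) = sum_j w_j * (x_j - c_j)^2.
    """
    X = np.asarray(X, dtype=float)
    w = np.asarray(weights, dtype=float)
    rng = np.random.default_rng(seed)
    # Initialize centers from k distinct data points.
    centers = X[rng.choice(len(X), size=k, replace=False)]

    def assign(centers):
        diff = X[:, None, :] - centers[None, :, :]
        dist = (w * diff ** 2).sum(axis=2)
        return dist.argmin(axis=1)

    for _ in range(n_iter):
        labels = assign(centers)
        # Recompute each center as the mean of its members;
        # keep the old center if a cluster is empty.
        new_centers = np.array([
            X[labels == c].mean(axis=0) if np.any(labels == c) else centers[c]
            for c in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers

    # Final assignment against the final centers.
    return assign(centers), centers
```

With weights that favor the first feature, for example `weighted_kmeans(X, weights_from_relatedness([0.9, 0.1]), k=2)`, points are grouped mainly by that feature, which is the intended effect of giving semantically relevant features larger weights.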