摘要:
针对CABOSFV聚类算法对数据输入顺序的敏感性问题,提出融合排序思想的高属性维稀疏数据聚类算法,通过计算首次聚类中两两高属性维稀疏数据非零属性取值情况确定所需要计算差异度的集合组合,减小了算法复杂度。应用结果表明,该方法能提高CABOSFV聚类的质量。
关键词:
高维稀疏数据,
CABOSFV聚类,
排序
Abstract:
To address the sensitivity of the CABOSFV clustering algorithm to the order in which data are input, this paper proposes a clustering algorithm for high-attribute-dimensional sparse data that incorporates a sorting idea. In the first clustering pass, the method examines the non-zero attribute values of each pair of high-attribute-dimensional sparse data objects to determine which set combinations require a difference computation, which reduces the algorithm's complexity. The method improves both the quality and the efficiency of clustering. Simulation results on one group of samples illustrate that it improves the quality of CABOSFV clustering.
Key words:
high dimensional sparse data,
CABOSFV clustering,
sorting
中图分类号:
祝琴, 高学东, 武森, 陈敏, 陈华. 基于排序思想的高维稀疏数据聚类[J]. 计算机工程, 2010, 36(22): 13-14.
ZHU Qin, GAO Xue-Dong, WU Sen, CHEN Min, CHEN Hua. High Dimensional Sparse Data Clustering Based on Sorting Idea[J]. Computer Engineering, 2010, 36(22): 13-14.