Abstract:
For blob-shaped data sets, or data sets whose classes differ greatly in sample count, the Fuzzy C-Means (FCM) algorithm and the semi-supervised FCM clustering algorithm may fail to partition the data correctly, because both tend to divide a data set into clusters of equal size. To address this problem, this paper uses the distribution density of each data point as a weight and combines it with semi-supervised learning to propose a semi-supervised, point-density-weighted FCM algorithm. During semi-supervised learning, simulated annealing is used to solve the minimization problem. Results show that the algorithm improves clustering accuracy.
Key words:
Fuzzy C-Means (FCM) clustering,
point density weighting,
semi-supervised learning
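
The abstract gives no formulas, so the following is only a minimal sketch of the general idea, under stated assumptions. It assumes the weight of a point is its normalised sum of inverse distances to all other points, and that labelled points are handled by clamping their memberships to the known cluster. It also swaps the simulated annealing step for the ordinary alternating FCM updates, purely to keep the example short. The names `density_weights`, `weighted_fcm` and `labels` are hypothetical and do not come from the paper.

```python
import math
import random


def density_weights(points):
    """Assumed weight: summed inverse distance to all other points, normalised to sum to 1."""
    raw = []
    for i, p in enumerate(points):
        z = sum(1.0 / max(math.dist(p, q), 1e-9)
                for j, q in enumerate(points) if j != i)
        raw.append(z)
    total = sum(raw)
    return [z / total for z in raw]


def weighted_fcm(points, c, labels=None, m=2.0, iters=100, seed=0):
    """Density-weighted FCM sketch; `labels` maps point index -> known cluster (assumption)."""
    labels = labels or {}
    rng = random.Random(seed)
    n, dim = len(points), len(points[0])
    w = density_weights(points)

    # Random initial memberships, each row normalised to sum to 1.
    u = []
    for _ in range(n):
        row = [rng.random() + 1e-6 for _ in range(c)]
        s = sum(row)
        u.append([x / s for x in row])

    centers = []
    for _ in range(iters):
        # Semi-supervision (assumed form): labelled points get crisp memberships.
        for i, k in labels.items():
            u[i] = [1.0 if j == k else 0.0 for j in range(c)]

        # Centre update: point i contributes with coefficient w_i * u_ik^m.
        centers = []
        for k in range(c):
            coef = [w[i] * u[i][k] ** m for i in range(n)]
            denom = max(sum(coef), 1e-12)
            centers.append([sum(coef[i] * points[i][d] for i in range(n)) / denom
                            for d in range(dim)])

        # Membership update for unlabelled points: the usual FCM rule,
        # because a per-point weight cancels out of this step.
        for i in range(n):
            if i in labels:
                continue
            dist = [max(math.dist(points[i], centers[k]), 1e-9) for k in range(c)]
            u[i] = [1.0 / sum((dist[k] / dist[j]) ** (2.0 / (m - 1.0))
                              for j in range(c))
                    for k in range(c)]
    return centers, u
```

A call such as `weighted_fcm(points, 2, labels={0: 0, 4: 1})` would treat points 0 and 4 as labelled samples of clusters 0 and 1. Replacing the alternating updates with a simulated annealing search over the same weighted objective would be closer to what the abstract describes.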
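Abstract (translated from Chinese): For blob-shaped data sets whose classes differ greatly in sample count, neither the FCM algorithm nor the semi-supervised fuzzy C-means clustering algorithm is the best clustering method, because both tend to partition a data set into equal parts. To address this, the distribution density of each sample point is used as a weight and combined with semi-supervised learning to propose a semi-supervised point-density-weighted fuzzy C-means clustering algorithm. During semi-supervised learning, simulated annealing is used to solve the extremum problem. The results show that the point-density-weighted fuzzy C-means clustering algorithm does improve clustering accuracy.
Key words (translated from Chinese):
fuzzy C-means clustering,
point density weighting,
semi-supervised learning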
JIANG Xiu-qin. Semi-supervised and Weighted Fuzzy C-means Clustering Algorithm[J]. Computer Engineering, 2009, 35(17): 170-171.
江秀勤. 半监督加权模糊C均值聚类算法[J]. 计算机工程, 2009, 35(17): 170-171.