
Computer Engineering

• Advanced Computing and Data Processing •

Clustering of Space Uncertain Data Based on Fuzzy C-means

XIAO Yupeng (肖宇鹏), HE Yunbin (何云斌), WAN Jing (万静), LI Song (李松)

  1. (School of Computer Science and Technology, Harbin University of Science and Technology, Harbin 150080, China)
  • Received: 2014-09-24 Online: 2015-10-15 Published: 2015-10-15
  • About the authors: XIAO Yupeng (born 1986), male, master's degree, main research interest: spatial data mining; HE Yunbin (corresponding author), professor; WAN Jing, professor, Ph.D.; LI Song, associate professor, Ph.D.
  • Funding:
    Natural Science Foundation of Heilongjiang Province (F201014, F201134, F201302); Science and Technology Research Foundation of the Education Department of Heilongjiang Province (12531120, 12541128, 12511100).

Abstract: To address the uncertainty of sample objects in the real world and the fuzziness of the boundaries between them, this paper proposes UFCM (Uncertain Fuzzy C-Means), a clustering algorithm for spatial uncertain data based on fuzzy C-means. Because clustering with UFCM involves a large number of complex integral computations of expected distances, its performance is unsatisfactory. An improved algorithm, I_UFCM, is therefore presented. It transforms the problem of clustering spatial uncertain objects into the traditional problem of clustering certain objects, and uses a similarity formula in place of the traditional Euclidean norm to reduce the amount of expected-distance computation and improve the quality of the clustering results. Experimental results show that, compared with the UFCM and UK-Means algorithms, I_UFCM achieves better clustering performance on spatial uncertain data sets and reduces CPU time by more than 90%.

Key words: fuzzy C-means, uncertain data, probability density function, expected distance, centroid
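
The idea of replacing expected-distance integrals with a computation on certain objects can be illustrated with a small sketch. It is not the authors' I_UFCM similarity formula, which the abstract does not state. It assumes squared Euclidean distance, for which the expected distance from an uncertain object to a fixed center equals the squared distance from the object's centroid plus a spread term that does not depend on the center. It also assumes each uncertain object is represented by samples drawn from its probability density function. All function names and the synthetic data are hypothetical.

```python
import random

def sq_dist(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def centroid(samples):
    """Mean position of the samples that represent one uncertain object."""
    n = len(samples)
    return [sum(col) / n for col in zip(*samples)]

def expected_sq_distance(samples, center):
    """Sample average of the squared distance to center (the costly route)."""
    return sum(sq_dist(s, center) for s in samples) / len(samples)

def centroid_shortcut(samples, center):
    """Same value: squared centroid distance plus a center-independent spread."""
    mean = centroid(samples)
    spread = sum(sq_dist(s, mean) for s in samples) / len(samples)
    return sq_dist(mean, center) + spread

def fuzzy_c_means(points, c, m=2.0, iters=200, tol=1e-9, seed=0):
    """Standard fuzzy C-means on certain points; returns (centers, memberships)."""
    rng = random.Random(seed)
    u = []
    for _ in points:  # random initial memberships, each row sums to 1
        row = [rng.random() + 1e-3 for _ in range(c)]
        total = sum(row)
        u.append([v / total for v in row])
    dim = len(points[0])
    centers = []
    for _ in range(iters):
        # Center update: mean of the points weighted by membership ** m.
        centers = []
        for j in range(c):
            weights = [row[j] ** m for row in u]
            wsum = sum(weights)
            centers.append([sum(w * p[k] for w, p in zip(weights, points)) / wsum
                            for k in range(dim)])
        # Membership update from squared distances to the new centers.
        new_u = []
        for p in points:
            inv = [max(sq_dist(p, ctr), 1e-12) ** (-1.0 / (m - 1.0))
                   for ctr in centers]
            total = sum(inv)
            new_u.append([v / total for v in inv])
        change = max(abs(a - b) for ra, rb in zip(new_u, u)
                     for a, b in zip(ra, rb))
        u = new_u
        if change < tol:
            break
    return centers, u

# Synthetic uncertain objects: two groups of 10 objects, 50 samples each.
data_rng = random.Random(42)
objects = []
for cx, cy in [(0.0, 0.0), (5.0, 5.0)]:
    for _ in range(10):
        ox, oy = data_rng.gauss(cx, 0.3), data_rng.gauss(cy, 0.3)
        objects.append([(data_rng.gauss(ox, 0.2), data_rng.gauss(oy, 0.2))
                        for _ in range(50)])

# Reduce each uncertain object to its centroid, then cluster the centroids.
centroids = [centroid(s) for s in objects]
centers, memberships = fuzzy_c_means(centroids, c=2)
labels = [row.index(max(row)) for row in memberships]
```

Because the spread term is the same for every candidate center, comparing centers by centroid distance gives the same ordering as comparing them by expected squared distance. This is the sense in which an uncertain object can be treated as a certain one in this sketch.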
