Computer Engineering ›› 2018, Vol. 44 ›› Issue (10): 175-181,189. doi: 10.19678/j.issn.1000-3428.0048566

• Artificial Intelligence and Recognition Technology •

Uncertain EFCM-ID Clustering Algorithm Based on General Distributed Interval Number

MAO Yimin1a, WANG Jiawei1a, LU Xinrong1b, MAO Dinghui2

  1a. School of Information Engineering; 1b. School of Applied Science, Jiangxi University of Science and Technology, Ganzhou, Jiangxi 341000, China; 2. 211 Brigade Co., Ltd. of Sino Shaanxi Nuclear Industry Group, Xi’an 710024, China
  • Received: 2017-09-06 Online: 2018-10-15 Published: 2018-10-15
  • About the authors: MAO Yimin (born 1970), female, professor, Ph.D., main research interest: data mining; WANG Jiawei, master's student; LU Xinrong, lecturer; MAO Dinghui, engineer.
  • Funding:
    National Natural Science Foundation of China (41562019); Key Program of the National Natural Science Foundation of China (41530640); Natural Science Foundation of Jiangxi Province (20161BAB203093); Science and Technology Project of the Education Department of Jiangxi Province (GJJ151531).

Abstract: Uncertain interval-number clustering algorithms based on Fuzzy C-Means (FCM) usually assume that the points within an interval number are uniformly distributed, which makes it hard to express the true attributes of the data. In addition, their clustering results depend heavily on the initial cluster centers, and their membership degrees are slow to update. To address these problems, an uncertain Efficient Fuzzy C-Means for Interval-valued Data (EFCM-ID) clustering algorithm based on generally distributed interval numbers is proposed. First, drawing on the idea of quartiles, a distance measure for generally distributed interval numbers, the MQ distance, is designed to describe uncertain data accurately. Second, the density idea is combined with a random sampling strategy to obtain SDCS, an optimized method for selecting the initial cluster centers, which improves the accuracy of the algorithm. On this basis, a relative accelerated membership update strategy is built using the idea of competitive learning, which reduces the running time of the algorithm. Experimental results show that, compared with the YFCM, XFCM and ExpFCMd-ID algorithms, the proposed algorithm has better stability and higher clustering efficiency.

Key words: uncertain clustering, interval number, Fuzzy C-Means (FCM), density idea, competitive learning idea
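
For orientation, the baseline that the abstract builds on can be sketched as a plain FCM loop over interval-valued data. This is only a minimal sketch of standard FCM, not the EFCM-ID algorithm: the abstract does not define the MQ distance, the SDCS initialization or the accelerated membership update, so none of them is reproduced here. The function names are hypothetical, and the Euclidean distance between interval endpoints is an assumed stand-in for whatever distance the paper uses.

```python
import math

def endpoint_distance(a, b):
    """Assumed distance: Euclidean distance between intervals seen as (lower, upper) vectors."""
    return math.hypot(a[0] - b[0], a[1] - b[1])

def update_memberships(data, centers, m=2.0, dist=endpoint_distance):
    """Standard FCM membership update; row k holds the degrees of item k in each cluster."""
    exponent = 2.0 / (m - 1.0)
    memberships = []
    for x in data:
        d = [dist(x, c) for c in centers]
        if any(v == 0.0 for v in d):
            # The item coincides with one or more centers: share membership among them.
            hits = [1.0 if v == 0.0 else 0.0 for v in d]
            memberships.append([h / sum(hits) for h in hits])
            continue
        memberships.append([1.0 / sum((d_i / d_j) ** exponent for d_j in d) for d_i in d])
    return memberships

def update_centers(data, memberships, n_clusters, m=2.0):
    """Standard FCM center update: membership-weighted mean of the interval endpoints."""
    centers = []
    for i in range(n_clusters):
        weights = [u[i] ** m for u in memberships]
        total = sum(weights)
        lower = sum(w * x[0] for w, x in zip(weights, data)) / total
        upper = sum(w * x[1] for w, x in zip(weights, data)) / total
        centers.append((lower, upper))
    return centers

def fcm(data, initial_centers, m=2.0, max_iter=100, tol=1e-6):
    """Alternate membership and center updates until the centers stop moving."""
    centers = list(initial_centers)
    for _ in range(max_iter):
        memberships = update_memberships(data, centers, m)
        new_centers = update_centers(data, memberships, len(centers), m)
        shift = max(endpoint_distance(a, b) for a, b in zip(centers, new_centers))
        centers = new_centers
        if shift < tol:
            break
    return centers, update_memberships(data, centers, m)

if __name__ == "__main__":
    intervals = [(0.0, 1.0), (0.2, 1.1), (5.0, 6.0), (5.1, 6.3)]
    centers, memberships = fcm(intervals, [(0.0, 1.0), (5.0, 6.0)])
    print(centers)
    print(memberships)
```

In this sketch the result depends on `initial_centers`, and every iteration recomputes all memberships from scratch. These are the two weaknesses that, according to the abstract, SDCS and the competitive-learning update strategy are meant to address.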
