
Computer Engineering ›› 2010, Vol. 36 ›› Issue (24): 153-155. doi: 10.3969/j.issn.1000-3428.2010.24.055

• Artificial Intelligence and Recognition Technology •

Research of Sampling-based Multi-modal Distribution Clustering Algorithm

LIU Jian-wei, LI Shuang-cheng, LUO Xiong-lin

  1. (Research Institute of Automation, China University of Petroleum, Beijing 102249, China)
  • Publication date: 2010-12-20  Release date: 2010-12-14
  • About the authors: LIU Jian-wei (b. 1966), male, associate research fellow, Ph.D., main research interest: machine learning; LI Shuang-cheng, master's student; LUO Xiong-lin, professor, Ph.D.

Abstract: Clustering algorithms take too long when processing high-dimensional, massive datasets. To address this problem, a sampling-based multi-modal distribution clustering optimization algorithm is proposed. During iterative correction, the algorithm randomly draws a small number of samples from the whole dataset, which reduces the clustering time. The optimal configuration parameters of the algorithm are found through a large number of experiments. Experimental results show that the optimized algorithm attains 88% of the clustering accuracy with 11.8% of the clustering running time, providing an optimal clustering scheme for application environments where time cost is high.

Key words: multi-modal distribution clustering, high time cost, optimal parameter
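
The abstract gives the core idea (correct the clustering iteratively from small random samples instead of the full dataset) but not the correction rule itself. The sketch below illustrates only that idea, and it substitutes a mini-batch k-means running-mean update for the unspecified rule. It is not the authors' multi-modal distribution clustering algorithm, and it does not reproduce the 11.8% / 88% figures. The names `nearest`, `sampled_clustering`, and `assign`, and the parameters `k`, `sample_size`, `iterations`, and `seed`, are all hypothetical.

```python
import random


def nearest(point, centers):
    """Return the index of the center closest to point (squared Euclidean distance)."""
    return min(
        range(len(centers)),
        key=lambda j: sum((p - c) ** 2 for p, c in zip(point, centers[j])),
    )


def sampled_clustering(points, k, sample_size, iterations, seed=0):
    """Estimate k cluster centers while touching only small random samples.

    Assumption: the correction rule is a mini-batch k-means running mean,
    used here as a stand-in because the abstract does not state the rule.
    """
    rng = random.Random(seed)
    centers = [list(p) for p in rng.sample(points, k)]  # initial guesses
    counts = [0] * k  # how many sampled points each center has absorbed
    for _ in range(iterations):
        # Draw a small random sample instead of scanning the whole dataset.
        for point in rng.sample(points, sample_size):
            j = nearest(point, centers)
            counts[j] += 1
            step = 1.0 / counts[j]  # shrinking step size gives a running mean
            centers[j] = [c + step * (p - c) for p, c in zip(point, centers[j])]
    return centers


def assign(points, centers):
    """Label every point with the index of its nearest center (one full pass)."""
    return [nearest(p, centers) for p in points]


if __name__ == "__main__":
    # Two tight, well-separated groups of 2-D points (illustrative data only).
    data = [(0.01 * i, 0.0) for i in range(50)]
    data += [(10.0 + 0.01 * i, 10.0) for i in range(50)]
    found = sampled_clustering(data, k=2, sample_size=10, iterations=50)
    print(found)
```

In this sketch the cost of the correction loop grows with `sample_size * iterations` rather than with the size of the dataset, which is the kind of trade-off between running time and accuracy that the abstract describes. Only the final call to `assign` scans every point.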
