Abstract: This paper proposes an improved unsupervised classification algorithm that exploits the intrinsic properties of massive high-dimensional data points and the fact that each point belongs to exactly one class. The algorithm computes the mutual similarity between high-dimensional points and, using similarity-based image processing techniques, segments and classifies the data set in each iteration; the small number of isolated points that remain are then reclassified. Experimental results show that the algorithm classifies high-dimensional data adaptively without manual intervention, and that it requires fewer iterations and less computation time than the K-means and ISODATA algorithms.
Key words:
high-dimensional mass data,
adaptive classification,
similarity,
unsupervised
CLC number:
WU Yong-liang, WAN Wang-gen, XU Xue-qiong. Research on Adaptive Classification of High-Dimensional Data[J]. Computer Engineering, 2010, 36(18): 210-213.
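
The abstract describes the procedure only in outline: compute pairwise similarity, split off classes iteratively, then reclassify isolated points. The Python sketch below is one possible reading of that outline, not the authors' implementation. Cosine similarity, the seed-selection rule, the function name `adaptive_classify`, and the parameters `sim_threshold` and `min_size` are all assumptions introduced here for illustration; the abstract does not specify any of them.

```python
import numpy as np


def adaptive_classify(points, sim_threshold=0.9, min_size=3):
    """Illustrative sketch: peel off one class per iteration, then
    reassign points left in very small (isolated) classes.

    All parameter names and defaults are hypothetical.
    """
    X = np.asarray(points, dtype=float)

    # Assumption: cosine similarity stands in for the paper's
    # (unspecified) mutual similarity measure.
    norms = np.linalg.norm(X, axis=1, keepdims=True)
    unit = X / np.where(norms == 0, 1.0, norms)
    sim = unit @ unit.T

    labels = np.full(len(X), -1)
    next_label = 0
    while (labels == -1).any():
        unassigned = np.flatnonzero(labels == -1)
        # Similarity graph restricted to the points not yet classified.
        sub = sim[np.ix_(unassigned, unassigned)] >= sim_threshold
        # Assumed seed rule: the point with the most similar neighbours.
        seed = sub.sum(axis=1).argmax()
        member_mask = sub[seed].copy()
        member_mask[seed] = True  # the seed always joins its own class
        labels[unassigned[member_mask]] = next_label
        next_label += 1

    # Reclassify isolated points: any class smaller than min_size is
    # merged into the class of its most similar well-populated neighbour.
    sizes = np.bincount(labels)
    in_large = sizes[labels] >= min_size
    if in_large.any() and not in_large.all():
        large_idx = np.flatnonzero(in_large)
        for i in np.flatnonzero(~in_large):
            nearest = large_idx[sim[i, large_idx].argmax()]
            labels[i] = labels[nearest]
    return labels
```

In this sketch the number of classes is not supplied in advance; it follows from the similarity threshold, which is how the sketch mirrors the adaptive behaviour the abstract contrasts with K-means. The returned labels are not guaranteed to be consecutive after the reassignment step.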