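Abstract: A small-sample KNN classification algorithm based on the k-nearest neighbor graph is proposed. The k-nearest neighbor graph is partitioned into several clusters of high internal similarity; the labeled data objects in each cluster are used to label the unlabeled objects in the same cluster, and noise data are removed from the original sample set, which expands the sample set. The new sample set is then used to assign classes to data objects whose class labels are unknown. Tests on standard datasets show that, with small samples, the algorithm improves the classification accuracy of KNN and reduces the influence of the nearest-neighbor threshold k on the classification result.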
Keywords:
KNN algorithm,
k-nearest neighbor graph,
small sample,
graph partitioning,
classification algorithm
Abstract: A KNN classification algorithm for small sample sets, based on the k-nearest neighbor graph, is presented to improve classification accuracy. The algorithm partitions the k-nearest neighbor graph into clusters with high similarity, labels the unlabeled data in each cluster with the label of the labeled data in the same cluster, and deletes the noise data, thereby expanding the sample set. The expanded sample set is then used to label the unlabeled data. Experiments on standard datasets show that, for small sample sets, the algorithm improves the classification accuracy of KNN and reduces the influence of the value of k on the result.
Key words:
KNN algorithm,
k-nearest neighbor graph,
small sample,
graph partitioning,
classification algorithm
Chinese Library Classification (CLC) number:
刘应东, 牛惠民. 基于k-最近邻图的小样本KNN分类算法[J]. 计算机工程, 2011, 37(9): 198-200.
LIU Ying-Dong, NIU Hui-Min. KNN Classification Algorithm Based on k-Nearest Neighbor Graph for Small Sample[J]. Computer Engineering, 2011, 37(9): 198-200.
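
The procedure summarized in the abstract can be illustrated with a minimal Python sketch. The abstract does not say how the k-nearest neighbor graph is partitioned or how noise is detected, so two assumptions are swapped in here: the partition is taken as the connected components of the mutual k-nearest neighbor graph, and a labeled sample counts as noise when its label disagrees with the majority label of its cluster. All function names (`knn_indices`, `partition_mutual_knn`, `expand_samples`, `knn_classify`) and the toy data are hypothetical and are not taken from the paper.

```python
import math
from collections import Counter, defaultdict


def knn_indices(points, i, k):
    """Indices of the k points closest to points[i], excluding i itself."""
    others = (j for j in range(len(points)) if j != i)
    return sorted(others, key=lambda j: math.dist(points[i], points[j]))[:k]


def partition_mutual_knn(points, k):
    """Assumed stand-in for the paper's graph partitioning:
    connected components of the mutual k-nearest neighbor graph."""
    nbrs = [set(knn_indices(points, i, k)) for i in range(len(points))]
    cluster = [-1] * len(points)
    n_clusters = 0
    for start in range(len(points)):
        if cluster[start] != -1:
            continue
        cluster[start] = n_clusters
        stack = [start]
        while stack:
            u = stack.pop()
            for v in nbrs[u]:
                # Keep an edge only if u and v list each other as neighbors.
                if u in nbrs[v] and cluster[v] == -1:
                    cluster[v] = n_clusters
                    stack.append(v)
        n_clusters += 1
    return cluster


def expand_samples(points, labels, k):
    """labels[i] is a class label, or None for an unlabeled object.
    Returns the expanded sample set as a list of (point, label) pairs."""
    cluster = partition_mutual_knn(points, k)
    votes = defaultdict(Counter)
    for c, y in zip(cluster, labels):
        if y is not None:
            votes[c][y] += 1
    expanded = []
    for p, c, y in zip(points, cluster, labels):
        if not votes[c]:
            continue  # cluster has no labeled member: leave it unlabeled
        majority = votes[c].most_common(1)[0][0]
        # Assumed noise rule: drop labeled samples that disagree with
        # the majority label of their cluster.
        if y is None or y == majority:
            expanded.append((p, majority))
    return expanded


def knn_classify(samples, query, k):
    """Plain KNN majority vote over a list of (point, label) samples."""
    nearest = sorted(samples, key=lambda s: math.dist(s[0], query))[:k]
    return Counter(y for _, y in nearest).most_common(1)[0][0]


# Toy data: two well-separated groups, few labels, one mislabeled sample.
points = [(0, 0), (0, 1), (1, 0), (1, 1),
          (10, 10), (10, 11), (11, 10), (11, 11)]
labels = ["a", "a", None, "b",
          "b", None, None, None]

expanded = expand_samples(points, labels, k=2)
prediction = knn_classify(expanded, (0.5, 0.5), k=3)
```

In this toy setting the four labeled samples grow into a larger training set, and the sample at `(1, 1)` is discarded because its label conflicts with the rest of its cluster. A different partitioning method or noise criterion would change which samples are kept.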