Abstract: To address the classification of high-dimensional Web text data, this paper proposes a text classification method based on manifold learning. The Neighborhood Preserving Embedding (NPE) manifold learning algorithm is used to recover the low-dimensional manifold structure embedded in the high-dimensional Web text space, and a K-Nearest Neighbor (KNN) classifier then performs classification and prediction in the resulting low-dimensional feature space. Experimental results show that the method achieves good classification accuracy and stable performance.
Key words:
Neighborhood Preserving Embedding (NPE) algorithm,
manifold learning,
text classification,
feature extraction,
K-Nearest Neighbor (KNN)
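
The two-stage pipeline summarized in the abstract (an NPE projection followed by KNN classification in the reduced space) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the abstract gives no parameter values, so the neighborhood size, target dimension, regularization terms, and the function names `npe_fit`, `knn_predict`, and `run_demo` are all assumptions made for illustration, and two synthetic Gaussian clusters stand in for real Web text feature vectors.

```python
import numpy as np


def npe_fit(X, n_neighbors=5, n_components=2, reg=1e-3):
    """Learn a linear NPE projection from X (rows are samples).

    Returns (mean, A), where A has shape (n_features, n_components).
    The regularization constants are illustrative assumptions.
    """
    mean = X.mean(axis=0)
    Xc = X - mean
    n, d = Xc.shape

    # Step 1: Euclidean k nearest neighbours of each sample (self excluded).
    sq = np.sum(Xc ** 2, axis=1)
    dist = sq[:, None] + sq[None, :] - 2.0 * (Xc @ Xc.T)
    np.fill_diagonal(dist, np.inf)
    nbrs = np.argsort(dist, axis=1)[:, :n_neighbors]

    # Step 2: weights that best reconstruct each sample from its
    # neighbours, constrained to sum to one.
    W = np.zeros((n, n))
    for i in range(n):
        Z = Xc[nbrs[i]] - Xc[i]
        G = Z @ Z.T
        G += (reg * np.trace(G) + 1e-9) * np.eye(n_neighbors)
        w = np.linalg.solve(G, np.ones(n_neighbors))
        W[i, nbrs[i]] = w / w.sum()

    # Step 3: solve (X^T M X) a = lambda (X^T X) a for the smallest
    # eigenvalues, using a Cholesky factor to reduce it to a symmetric problem.
    R = np.eye(n) - W
    M = R.T @ R
    left = Xc.T @ M @ Xc
    right = Xc.T @ Xc
    right += (reg * np.trace(right) / d + 1e-9) * np.eye(d)
    L_inv = np.linalg.inv(np.linalg.cholesky(right))
    C = L_inv @ left @ L_inv.T
    _, vecs = np.linalg.eigh((C + C.T) / 2.0)  # eigenvalues ascending
    A = L_inv.T @ vecs[:, :n_components]
    return mean, A


def knn_predict(train_Y, train_labels, test_Y, k=3):
    """Majority vote among the k nearest training points."""
    preds = []
    for y in test_Y:
        d2 = np.sum((train_Y - y) ** 2, axis=1)
        nearest = train_labels[np.argsort(d2)[:k]]
        values, counts = np.unique(nearest, return_counts=True)
        preds.append(values[np.argmax(counts)])
    return np.array(preds)


def run_demo(seed=0):
    """Synthetic two-class data standing in for text feature vectors."""
    rng = np.random.default_rng(seed)
    n_per_class, d = 100, 20
    direction = np.ones(d) / np.sqrt(d)
    labels = np.repeat([0, 1], n_per_class)
    offsets = np.where(labels[:, None] == 0, -6.0, 6.0) * direction
    X = rng.normal(size=(2 * n_per_class, d)) + offsets
    order = rng.permutation(len(X))
    train, test = order[:150], order[150:]
    mean, A = npe_fit(X[train], n_neighbors=5, n_components=2)
    pred = knn_predict((X[train] - mean) @ A, labels[train],
                       (X[test] - mean) @ A, k=3)
    return float(np.mean(pred == labels[test]))
```

Because NPE yields an explicit linear map `A`, unseen samples are projected with the same `(x - mean) @ A` transform used for the training set, which is what allows KNN to classify new documents in the low-dimensional space. Calling `run_demo()` returns the held-out accuracy on the synthetic data; it says nothing about the accuracy reported in the paper.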
CLC number:
XU Hai-Rui (徐海瑞), ZHANG Wen-Sheng (张文生), WU Shuang (吴双). Research of Web Text Classification Method Based on Neighborhood Preserving Embedding[J]. Computer Engineering (计算机工程), 2011, 37(17): 133-135.