
Computer Engineering ›› 2009, Vol. 35 ›› Issue (15): 38-40. doi: 10.3969/j.issn.1000-3428.2009.15.013

• Software Technology and Database •

Web Document Classification Algorithm Based on Manifold Learning and SVM

WANG Zi-qiang, QIAN Xu

  1. (College of Mechanical Electronic and Information Engineering, China University of Mining and Technology (Beijing), Beijing 100083)
  • Received: 1900-01-01 Revised: 1900-01-01 Online: 2009-08-05 Published: 2009-08-05

Abstract: To address the Web document classification problem efficiently, a novel Web document classification algorithm based on manifold learning and the Support Vector Machine (SVM) is proposed. The algorithm uses the manifold learning algorithm LPP to non-linearly reduce the high-dimensional Web document space of the training set to a lower-dimensional space, uncovering the meaningful low-dimensional structure hidden in the high-dimensional observational data. Classification and prediction are then carried out in the reduced feature space by an optimized SVM trained with multiplicative update rules. Experimental results show that the algorithm achieves higher classification accuracy with less running time.

Key words: document classification, manifold learning, Support Vector Machine (SVM)
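The abstract does not spell out the multiplicative update rule, so the following is only a minimal sketch of one well-known form of it: the multiplicative update for nonnegative quadratic programming, applied to the SVM dual. It assumes a hard-margin SVM without a bias term and a linear kernel, and it assumes the input vectors have already been reduced by LPP (that step is not shown). The function names `train_multiplicative_svm`, `decision_value`, and `predict` are hypothetical and are not taken from the paper.

```python
import math


def linear_kernel(u, v):
    """Dot product between two equal-length feature vectors."""
    return sum(a * b for a, b in zip(u, v))


def train_multiplicative_svm(X, y, kernel=linear_kernel, iterations=500):
    """Fit dual coefficients for a hard-margin SVM with no bias term.

    Minimises F(alpha) = 0.5 * alpha^T A alpha - sum(alpha) subject to
    alpha >= 0, where A[i][j] = y[i] * y[j] * kernel(X[i], X[j]).
    Assumption: the data are separable by a hyperplane through the origin.
    """
    n = len(X)
    A = [[y[i] * y[j] * kernel(X[i], X[j]) for j in range(n)] for i in range(n)]
    # Split A into nonnegative parts so that A = A_pos - A_neg.
    A_pos = [[max(a, 0.0) for a in row] for row in A]
    A_neg = [[max(-a, 0.0) for a in row] for row in A]

    alpha = [1.0] * n  # strictly positive starting point
    for _ in range(iterations):
        updated = []
        for i in range(n):
            p = sum(A_pos[i][j] * alpha[j] for j in range(n))
            q = sum(A_neg[i][j] * alpha[j] for j in range(n))
            if p == 0.0:
                # Guard against division by zero (e.g. an all-zero vector).
                updated.append(alpha[i])
                continue
            # Multiplicative factor for linear coefficient b_i = -1.
            # It is always positive, so alpha never leaves the feasible set.
            factor = (1.0 + math.sqrt(1.0 + 4.0 * p * q)) / (2.0 * p)
            updated.append(alpha[i] * factor)
        alpha = updated  # all coefficients are updated in parallel
    return alpha


def decision_value(x, X, y, alpha, kernel=linear_kernel):
    """Signed score sum_i alpha_i * y_i * K(x_i, x) for a new point x."""
    return sum(a * yi * kernel(xi, x) for a, yi, xi in zip(alpha, y, X))


def predict(x, X, y, alpha, kernel=linear_kernel):
    """Return the predicted label, +1 or -1."""
    return 1 if decision_value(x, X, y, alpha, kernel) >= 0 else -1


if __name__ == "__main__":
    # Toy 2-D points standing in for LPP-reduced document vectors.
    X_train = [[2.0, -1.0], [-1.0, 2.0], [-2.0, 1.0], [1.0, -2.0]]
    y_train = [1, 1, -1, -1]
    alpha = train_multiplicative_svm(X_train, y_train)
    print([predict(x, X_train, y_train, alpha) for x in X_train])
```

Because each step only rescales the coefficients by a positive factor, the nonnegativity constraint holds without any projection or step-size tuning, which is one plausible reason such rules are attractive for reducing running time. A soft-margin or biased variant would need additional constraints that this sketch does not handle.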
