
Computer Engineering ›› 2007, Vol. 33 ›› Issue (09): 196-198.

• Artificial Intelligence and Recognition Technology •

Algorithm Research on Multiple Kernel Learning SVM for Text Classification

CHEN Lianna, YAO Futian

  1. (Department of Computer Science, China Jiliang University, Hangzhou 310018)
  • Received: 1900-01-01 Revised: 1900-01-01 Online: 2007-05-05 Published: 2007-05-05

Abstract: Text classification often involves multiple, heterogeneous data sources. To address this, the paper proposes a multiple kernel learning SVM algorithm. Conic combinations of kernel matrices for classification lead to a convex quadratically constrained quadratic program; the algorithm reformulates this problem as a semi-infinite program and shows that it can be solved efficiently by recycling standard SVM implementations. Experimental results show that the proposed algorithm works for hundreds of thousands of examples or hundreds of kernels to be combined, and that it achieves relatively high recall and precision when classifying e-mail text with multiple, heterogeneous data sources.

Key words: Text classification, SVM, Multiple kernel learning
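
The abstract gives no formulas, so the following is only a minimal sketch of the idea it names, not the authors' implementation. It builds a conic (non-negative) combination of two kernel matrices and checks that the SVM dual term is linear in the kernel weights, which is the property that lets semi-infinite formulations in the multiple kernel learning literature alternate between an ordinary SVM solve and a linear update of the weights. The function names, the toy data, the quantity `S(alpha)` and the constraint that the weights sum to one are all assumptions taken from that general literature, and no SVM or LP solver is included.

```python
import math


def linear_kernel(X):
    """Gram matrix of plain dot products."""
    return [[sum(a * b for a, b in zip(xi, xj)) for xj in X] for xi in X]


def rbf_kernel(X, gamma):
    """Gram matrix of exp(-gamma * ||xi - xj||^2)."""
    return [[math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(xi, xj)))
             for xj in X] for xi in X]


def combine_kernels(kernels, beta):
    """Conic combination sum_k beta_k * K_k; every beta_k must be >= 0."""
    if len(kernels) != len(beta):
        raise ValueError("need exactly one weight per kernel")
    if any(b < 0 for b in beta):
        raise ValueError("kernel weights must be non-negative")
    n = len(kernels[0])
    return [[sum(b * K[i][j] for b, K in zip(beta, kernels))
             for j in range(n)] for i in range(n)]


def s_term(K, alpha, y):
    """S(alpha) = 1/2 * sum_ij alpha_i alpha_j y_i y_j K_ij - sum_i alpha_i."""
    n = len(alpha)
    quad = sum(alpha[i] * alpha[j] * y[i] * y[j] * K[i][j]
               for i in range(n) for j in range(n))
    return 0.5 * quad - sum(alpha)


# Hypothetical toy data: four 2-D points with labels +1 / -1.
X = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.0, 0.0]]
y = [1, 1, -1, -1]

# A hand-picked dual vector (not an SVM solution) that satisfies the usual
# constraints 0 <= alpha_i <= C with C = 1 and sum_i y_i * alpha_i = 0.
alpha = [0.5, 0.5, 0.5, 0.5]

# Assumed kernel weights: non-negative and summing to one.
beta = [0.25, 0.75]

kernels = [linear_kernel(X), rbf_kernel(X, gamma=0.5)]
per_kernel = [s_term(K, alpha, y) for K in kernels]

# Weighted sum of per-kernel terms versus the term of the combined kernel.
weighted = sum(b * s for b, s in zip(beta, per_kernel))
combined = s_term(combine_kernels(kernels, beta), alpha, y)

print("per-kernel S values:", per_kernel)
print("weighted sum:", weighted, "combined kernel:", combined)
```

Because the two printed totals agree for any fixed `alpha`, each candidate `alpha` contributes one linear constraint on the weights. A full solver would generate such constraints by repeatedly training a standard SVM on the combined kernel, which is one reading of "recycling standard SVM implementations".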

CLC Number: