Abstract: This paper proposes a multiclass text categorization method based on an improved support vector machine (SVM) that combines a binary tree, pre-extraction of support vectors, and a cyclic iterative algorithm. Compared with existing multiclass SVM algorithms, the method has higher computational efficiency. The concrete implementation procedure is given and applied to text categorization; experimental results show that the algorithm is both effective and efficient for this task.
Key words:
Text categorization,
Support vector machine,
Iterative algorithm,
Binary tree
CLC number:
YING Wei; WANG Zheng'ou; AN Jinlong. Study on Multiclass Text Categorization Method Based on Improved Support Vector Machine[J]. Computer Engineering, 2006, 32(16): 74-76.
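The abstract names a binary-tree decomposition but does not spell out the procedure, so the following is only a minimal sketch of the general idea, not the authors' algorithm. It assumes a balanced split of the class list at each node and uses a margin perceptron as a stand-in for the SVM solver; the pre-extraction of support vectors and the cyclic iterative step are not reproduced. All names (`build_tree`, `predict`, `TOY_DOCS`, and so on) and the toy term-count vectors are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional, Sequence, Tuple

Vector = Sequence[float]


def train_margin_perceptron(xs: List[Vector], ys: List[int],
                            lr: float = 0.1, epochs: int = 200
                            ) -> Tuple[List[float], float]:
    """Stand-in for an SVM solver (assumption): hinge-style updates
    without regularization, stopped once every margin is at least 1."""
    w = [0.0] * len(xs[0])
    b = 0.0
    for _ in range(epochs):
        updated = False
        for x, y in zip(xs, ys):
            score = sum(wi * xi for wi, xi in zip(w, x)) + b
            if y * score < 1.0:  # margin violated: move the hyperplane
                w = [wi + lr * y * xi for wi, xi in zip(w, x)]
                b += lr * y
                updated = True
        if not updated:
            break
    return w, b


@dataclass
class Node:
    classes: List[str]
    w: Optional[List[float]] = None  # None marks a leaf (single class)
    b: float = 0.0
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def build_tree(xs: List[Vector], labels: List[str], classes: List[str]) -> Node:
    """Recursively split the class list in half and train one binary
    classifier per internal node (first half = +1, second half = -1)."""
    node = Node(classes=list(classes))
    if len(classes) == 1:
        return node
    mid = len(classes) // 2
    left_classes, right_classes = classes[:mid], classes[mid:]
    left_set = set(left_classes)
    ys = [1 if lab in left_set else -1 for lab in labels]
    node.w, node.b = train_margin_perceptron(xs, ys)
    left = [(x, lab) for x, lab in zip(xs, labels) if lab in left_set]
    right = [(x, lab) for x, lab in zip(xs, labels) if lab not in left_set]
    node.left = build_tree([x for x, _ in left], [l for _, l in left], left_classes)
    node.right = build_tree([x for x, _ in right], [l for _, l in right], right_classes)
    return node


def predict(node: Node, x: Vector) -> str:
    """Walk from the root to a leaf, evaluating one classifier per level."""
    while node.w is not None:
        score = sum(wi * xi for wi, xi in zip(node.w, x)) + node.b
        node = node.left if score >= 0 else node.right
    return node.classes[0]


def count_classifiers(node: Node) -> int:
    """Number of internal nodes, i.e. trained binary classifiers."""
    if node.w is None:
        return 0
    return 1 + count_classifiers(node.left) + count_classifiers(node.right)


# Hypothetical toy term-count vectors: one dominant feature per class.
TOY_DOCS = [[3, 0, 0, 0], [2, 1, 0, 0], [0, 3, 0, 0], [1, 2, 0, 0],
            [0, 0, 3, 0], [0, 0, 2, 1], [0, 0, 0, 3], [0, 0, 1, 2]]
TOY_LABELS = ["sports", "sports", "finance", "finance",
              "tech", "tech", "health", "health"]

if __name__ == "__main__":
    tree = build_tree(TOY_DOCS, TOY_LABELS, sorted(set(TOY_LABELS)))
    print(count_classifiers(tree), [predict(tree, x) for x in TOY_DOCS])
```

With K classes such a tree trains K - 1 binary classifiers, and a balanced tree evaluates about log2(K) of them per document, which is one common source of the efficiency gain over one-against-one or one-against-rest schemes.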