Abstract:
A Chinese Blog topic classification method based on the approximate minimum enclosing ball (MEB) is proposed. The optimization problem of the Support Vector Machine (SVM) is transformed into an equivalent approximate MEB problem, so the Blog topic classifier can be trained quickly using only a core subset of the original large-scale dataset. Feature selection and topic classification experiments are carried out on a large-scale Blog dataset. The experimental results show that the method achieves good classification precision and a short run time.
Key words:
Blog classification,
approximate minimum enclosing ball,
Support Vector Machine (SVM),
core vector machine,
data mining,
new media
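Abstract: A Chinese Blog topic classification method based on the approximate minimum enclosing ball principle is proposed. Following this principle, the Support Vector Machine optimization problem is converted into an approximate minimum enclosing ball problem, so that only a core subset of a large-scale dataset needs to take part in training the classifier, which improves the ability to handle large training sets in Blog topic classification. Chinese Blog feature selection and topic classification experiments are carried out on a fairly large Blog dataset. The experimental results show that the method not only reaches accuracy on par with the Support Vector Machine but also reduces training time, giving good Blog topic classification results.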
Key words:
Blog classification,
approximate minimum enclosing ball,
Support Vector Machine,
core vector machine,
data mining,
new media
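The core-subset idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: it assumes plain Euclidean space rather than the kernel-induced feature space a core vector machine works in, and it uses the simple Bădoiu-Clarkson farthest-point iteration as a stand-in for the approximate minimum enclosing ball solver. The names `approx_meb` and `eps` and the toy points are hypothetical, chosen only for illustration.

```python
import math


def approx_meb(points, eps=0.1):
    """Approximate minimum enclosing ball (illustrative sketch only).

    Bădoiu-Clarkson iteration: repeatedly pull the center toward the
    current farthest point with a shrinking step 1 / (i + 1). After about
    1 / eps**2 steps the ball around the center that covers all points
    has radius at most (1 + eps) times the optimal radius.
    Returns (center, radius, core), where core lists the indices of the
    points that were ever selected as the farthest point.
    """
    center = list(points[0])          # start from an arbitrary point
    core = [0]                        # indices forming the core subset
    n_iter = math.ceil(1.0 / eps ** 2)
    for i in range(1, n_iter + 1):
        # Farthest point from the current center.
        far = max(range(len(points)),
                  key=lambda j: math.dist(points[j], center))
        if far not in core:
            core.append(far)
        # Move the center a fraction 1 / (i + 1) of the way toward it.
        center = [c + (p - c) / (i + 1)
                  for c, p in zip(center, points[far])]
    # Smallest radius around the final center that encloses every point.
    radius = max(math.dist(p, center) for p in points)
    return center, radius, core


if __name__ == "__main__":
    # Toy data: four corners of a square plus two interior points.
    pts = [(-1.0, -1.0), (1.0, -1.0), (1.0, 1.0), (-1.0, 1.0),
           (0.2, 0.3), (-0.5, 0.1)]
    center, radius, core = approx_meb(pts, eps=0.1)
    print("center:", center)
    print("radius:", radius)
    print("core indices:", core)
```

In this sketch only boundary points can enter `core`; interior points never become the farthest point. This mirrors why a classifier trained on the core subset can ignore most of a large training set.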
FU Xiang-Hua, GUO Wu-Biao, LIU Guo, WANG Zhi-Qiang. Chinese Blog Classification Based on Minimum Enclosing Ball[J]. Computer Engineering, 2012, 38(23): 162-165.
傅向华, 郭武彪, 刘国, 王志强. 基于最小闭包球的中文博客分类[J]. 计算机工程, 2012, 38(23): 162-165.