
Computer Engineering ›› 2007, Vol. 33 ›› Issue (08): 205-207. doi: 10.3969/j.issn.1000-3428.2007.08.072

• Artificial Intelligence and Recognition Technology •

A New SVM Iterative Algorithm in Large Training Set

TIAN Xinmei, WU Xiuqing, LIU Li

  1. (Dept. of Electronic Eng. & Information Science, University of Science and Technology of China, Hefei 230027)
  • Received: 1900-01-01 Revised: 1900-01-01 Online: 2007-04-20 Published: 2007-04-20

Abstract: To address the slow learning and classification speed of SVMs on large training sets, a new iterative SVM training algorithm is proposed. The algorithm compresses the training set with the K-means clustering algorithm and takes the cluster centers as the initial training set, which reduces redundancy among samples and speeds up learning. To preserve learning accuracy, the training set is then updated by adding boundary samples and misclassified samples to the initial training set, and training is repeated until the number of misclassified samples no longer changes. Experiments show that this K-means-based iterative SVM algorithm reduces the size of both the training set and the support vector set of the decision function while maintaining learning accuracy, thereby speeding up both learning and classification.
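The loop described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the abstract does not say how cluster centers get labels, so each class is clustered separately here. A linear SVM trained by batch subgradient descent on the hinge loss stands in for whatever SVM solver and kernel the paper uses. "Boundary samples" are taken to be samples with y·f(x) < 1. All function names, parameter values, and the synthetic data are hypothetical.

```python
import random

def decision(w, b, x):
    """Linear decision value f(x) = w.x + b."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b

def kmeans(points, k, iters=20, seed=0):
    """Plain Lloyd's K-means on tuples; returns k cluster centers."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k),
                          key=lambda c: sum((a - q) ** 2 for a, q in zip(p, centers[c])))
            groups[nearest].append(p)
        # An empty cluster keeps its previous center.
        centers = [tuple(sum(col) / len(g) for col in zip(*g)) if g else centers[i]
                   for i, g in enumerate(groups)]
    return centers

def train_linear_svm(X, y, lam=0.01, lr=0.1, epochs=300):
    """Assumed stand-in solver: batch subgradient descent on the
    L2-regularised hinge loss (linear kernel only)."""
    d, n = len(X[0]), len(X)
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        gw, gb = [lam * wi for wi in w], 0.0
        for xi, yi in zip(X, y):
            if yi * decision(w, b, xi) < 1:      # sample violates the margin
                for j in range(d):
                    gw[j] -= yi * xi[j] / n
                gb -= yi / n
        w = [wi - lr * gi for wi, gi in zip(w, gw)]
        b -= lr * gb
    return w, b

def iterative_svm(X, y, k_per_class=5, max_rounds=10):
    # Step 1: compress each class to its K-means centers
    # (per-class clustering is an assumption of this sketch).
    train_X, train_y = [], []
    for label in (-1, 1):
        members = [x for x, t in zip(X, y) if t == label]
        for center in kmeans(members, k_per_class):
            train_X.append(center)
            train_y.append(label)
    added = set()            # indices of original samples already added
    prev_errors = None
    for _ in range(max_rounds):
        w, b = train_linear_svm(train_X, train_y)
        margins = [yi * decision(w, b, xi) for xi, yi in zip(X, y)]
        errors = sum(m <= 0 for m in margins)
        if errors == prev_errors:                # stopping rule from the abstract
            break
        prev_errors = errors
        # Step 2: add boundary samples (0 < margin < 1) and
        # misclassified samples (margin <= 0) to the training set.
        for i, m in enumerate(margins):
            if m < 1 and i not in added:
                added.add(i)
                train_X.append(X[i])
                train_y.append(y[i])
    return w, b, len(train_X), errors

# Synthetic two-class data: two 2-D Gaussian blobs (illustrative only).
rng = random.Random(42)
X, y = [], []
for label, mu in ((-1, -2.0), (1, 2.0)):
    for _ in range(200):
        X.append((rng.gauss(mu, 1.0), rng.gauss(mu, 1.0)))
        y.append(label)

w, b, n_train, errors = iterative_svm(X, y)
print(f"trained on {n_train} of {len(X)} samples, {errors} misclassified")
```

The point of the sketch is the data flow: each round trains only on the cluster centers plus the samples the previous model found hard, so the working set stays smaller than the full data. With a kernel SVM, the same loop would also tend to shrink the support vector set, which is the effect the abstract reports.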

Key words: support vector machine (SVM), machine learning, K-means clustering algorithm, iterative algorithm
