Computer Engineering, 2011, Vol. 37, Issue (14): 178-179. doi: 10.3969/j.issn.1000-3428.2011.14.059

• Artificial Intelligence and Recognition Technology •

Imbalanced Data Classification Method for Bagging Combination

QIN Jiao-long, WANG Wei

  1. (Machine Learning and Cognition Lab, School of Education Science, Nanjing Normal University, Nanjing 210097, China)
  • Received: 2010-12-30 Online: 2011-07-20 Published: 2011-07-20
  • About the authors: QIN Jiao-long (born 1985), female, master's student, main research interest: machine learning; WANG Wei, professor
  • Funding:
    Scientific Research Foundation for Returned Overseas Chinese Scholars, Ministry of Education

Abstract: This paper proposes CombineBagging, a classification method for imbalanced data based on Bagging combination. The data are first preprocessed with SMOTE, a minority-class oversampling algorithm. Three different base-classifier learning algorithms, C-SVM, a Radial Basis Function (RBF) neural network, and Random Forests, are then each used for Bagging ensemble learning on the resampled data, and the three learning results are integrated into a final result by a voting rule. Experimental results on imbalanced data from five different domains show that CombineBagging improves classification accuracy on the minority class and handles the imbalanced data classification problem effectively.

Key words: Bagging combination, imbalanced data classification, Support Vector Machine (SVM), neural network, Random Forests algorithm
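
The abstract names SMOTE as the preprocessing step but gives no parameters. As a rough illustration only, the core idea of SMOTE (creating synthetic minority samples by interpolating between a minority sample and one of its nearest minority-class neighbours) might be sketched as follows. The function name `smote`, the parameters `n_new`, `k`, and `seed`, and the toy data are assumptions made for this sketch. It also draws base samples at random instead of looping over every minority sample, which is a simplification.

```python
import math
import random


def smote(minority, n_new, k=5, seed=0):
    """Return n_new synthetic points interpolated between minority samples.

    Illustrative sketch of SMOTE-style oversampling; all names here are
    hypothetical and not taken from the paper.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples to interpolate")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)  # cannot use more neighbours than other points
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the other minority samples by Euclidean distance to `base`.
        others = [p for j, p in enumerate(minority) if j != i]
        others.sort(key=lambda p: math.dist(base, p))
        neighbour = rng.choice(others[:k])  # one of the k nearest neighbours
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


# Toy minority class with four two-dimensional samples (made-up values).
minority = [[1.0, 1.0], [1.2, 0.9], [0.8, 1.1], [1.1, 1.3]]
new_points = smote(minority, n_new=6, k=2)
print(len(new_points))
```

Because every synthetic point lies on a segment between two existing minority samples, the oversampled class stays inside the region the minority class already occupies.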

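
The two-level structure described in the abstract, Bagging applied separately with each base learner followed by a vote over the resulting ensembles, could look roughly like the sketch below. The abstract does not specify the voting rule, so plain majority voting is assumed at both levels. The three base learners used here (`nearest_centroid`, `one_nn`, `three_nn`) are deliberately simple stand-ins so that the example runs with the standard library only; they are not the C-SVM, RBF neural network, or Random Forests learners of the paper, and all function names are hypothetical.

```python
import math
import random
from collections import Counter


def majority_vote(labels):
    """Most common label; ties go to the label seen first."""
    return Counter(labels).most_common(1)[0][0]


def bagging_fit(make_learner, X, y, n_estimators=10, seed=0):
    """Train n_estimators copies of one base learner on bootstrap resamples."""
    rng = random.Random(seed)
    models = []
    for _ in range(n_estimators):
        idx = [rng.randrange(len(X)) for _ in range(len(X))]  # with replacement
        models.append(make_learner([X[i] for i in idx], [y[i] for i in idx]))
    return models


def bagging_predict(models, x):
    """First-level vote: combine the models of one bagged ensemble."""
    return majority_vote([m(x) for m in models])


def combine_predict(ensembles, x):
    """Second-level vote: combine the outputs of the bagged ensembles."""
    return majority_vote([bagging_predict(models, x) for models in ensembles])


# Stand-in base learners (NOT the paper's C-SVM / RBF network / Random Forests).
# Each takes training data and returns a function mapping a point to a label.
def nearest_centroid(X, y):
    groups = {}
    for xi, yi in zip(X, y):
        groups.setdefault(yi, []).append(xi)
    centroids = {c: [sum(col) / len(pts) for col in zip(*pts)]
                 for c, pts in groups.items()}
    return lambda x: min(centroids, key=lambda c: math.dist(x, centroids[c]))


def one_nn(X, y):
    return lambda x: y[min(range(len(X)), key=lambda i: math.dist(x, X[i]))]


def three_nn(X, y):
    def predict(x):
        nearest = sorted(range(len(X)), key=lambda i: math.dist(x, X[i]))[:3]
        return majority_vote([y[i] for i in nearest])
    return predict


# Toy data standing in for a set that has already been balanced by oversampling.
X = [[0.0, 0.1], [0.2, 0.0], [0.1, 0.3], [0.3, 0.2], [0.0, 0.4], [0.4, 0.1],
     [5.0, 5.1], [5.2, 4.9], [4.9, 5.3], [5.3, 5.0], [5.1, 4.8], [4.8, 5.2]]
y = [0] * 6 + [1] * 6

stand_ins = [nearest_centroid, one_nn, three_nn]
ensembles = [bagging_fit(make, X, y, n_estimators=5, seed=s)
             for s, make in enumerate(stand_ins)]
print(combine_predict(ensembles, [0.1, 0.2]), combine_predict(ensembles, [5.0, 5.0]))
```

Passing the learner as a factory function keeps the Bagging and voting code independent of the base algorithm, so swapping in real classifiers would only require wrapping each one to return a point-to-label function.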