Abstract:
To improve classification performance on the minority class in unbalanced data sets, this paper combines the advantages of boosting and over-sampling and presents MCMO-Boost, an over-sampling algorithm based on the boosting algorithm Adaboost. MCMO-Boost is compared experimentally with the decision tree algorithm C4.5, the boosting algorithm Adaboost, and the over-sampling algorithm SMOTE. The results show that MCMO-Boost outperforms the other algorithms in classification performance on both the minority class and the data set as a whole.
Key words:
Unbalanced data set,
Over-sampling,
Boosting algorithm
CLC Number:
HAN Hui; WANG Wenyuan; MAO Binghuan. Over-sampling Algorithm Based on Adaboost in Unbalanced Data Set[J]. Computer Engineering, 2007, 33(10): 207-209.
韩 慧;王文渊;毛炳寰. 不均衡数据集中基于Adaboost的过抽样算法[J]. 计算机工程, 2007, 33(10): 207-209.