Computer Engineering ›› 2022, Vol. 48 ›› Issue (4): 99-105,112. doi: 10.19678/j.issn.1000-3428.0061001

• Artificial Intelligence and Pattern Recognition •

Imbalanced Data Classification Based on Ensemble Weighted Broad Learning System with AdaBoost

WANG Mengduo, XU Xinying, YAN Gaowei, SHI Lijuan, GUO Lei   

  1. School of Electrical and Power Engineering, Taiyuan University of Technology, Taiyuan 030024, China
  • Received: 2021-03-03 Revised: 2021-05-08 Published: 2021-05-17

基于AdaBoost集成加权宽度学习系统的不平衡数据分类

王萌铎, 续欣莹, 阎高伟, 史丽娟, 郭磊   

  1. 太原理工大学 电气与动力工程学院, 太原 030024
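  • About the authors: WANG Mengduo (born 1996), male, master's student, with main research interests in multimodal fusion and broad learning systems; XU Xinying, professor, Ph.D.; YAN Gaowei, professor, Ph.D., doctoral supervisor; SHI Lijuan, master's student; GUO Lei, doctoral student.
  • Supported by:
    General Program of the National Natural Science Foundation of China (61973226); Natural Science Foundation of Shanxi Province (201801D121144).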

Abstract: The Broad Learning System (BLS) is a shallow network structure with advantages such as rapid training and incremental learning. On imbalanced data, however, BLS extracts few minority-class features, which degrades its recognition of those classes. To solve this problem, this study proposes an imbalanced data classification method based on an AdaBoost ensemble of Weighted Broad Learning Systems (AdaBoost-WBLS). The method updates the weights dynamically over the iterations so that they better match the characteristics of the data, which improves the ability of the ensemble to recognize minority classes. Based on the Karush-Kuhn-Tucker (KKT) conditions, the weighted optimization process of the Weighted Broad Learning System (WBLS) is derived theoretically to verify that the diagonal weights suppress the BLS error. AdaBoost-WBLS initializes its weights from category information, which increases the ensemble training efficiency of the model. During weight updating, different regularized update rules are applied to different data categories, both to retain the features within each class and to increase the separation between classes. In the experiments, the parameters of the AdaBoost-WBLS model are optimized within a limited range for the different datasets. The experimental results show that, compared with both AdaBoost- and BLS-related models, the AdaBoost-WBLS model improves the extraction of minority-class features. On the Satimage dataset, the G-mean of the AdaBoost-WBLS model is 4.36 percentage points higher than that of the Weighted Minority Oversampling Deep Auto-encoder (WMODA) model, which shows that the AdaBoost-WBLS model recognizes imbalanced data significantly better.

Key words: Broad Learning System (BLS), AdaBoost model, imbalanced data, Weighted Broad Learning System (WBLS), ensemble learning
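
The abstract names three ingredients without giving formulas: weights initialized from category information, a regularized least-squares solve with diagonal weights, and the G-mean metric. The following is a minimal sketch of one common form of each, not the paper's own implementation. It assumes initial weights inversely proportional to class size, the standard weighted ridge closed form B = (AᵀWA + λI)⁻¹AᵀWY, and G-mean defined as the geometric mean of per-class recalls. The function names, the matrix `A` (standing in for the BLS feature and enhancement node outputs), and the parameter `lam` are hypothetical.

```python
import numpy as np


def class_balanced_weights(y):
    """Initial sample weights inversely proportional to class size.

    Assumed scheme (not taken from the paper): every class receives the
    same total weight, and all weights sum to 1.
    """
    classes, counts = np.unique(y, return_counts=True)
    per_class = {c: 1.0 / n for c, n in zip(classes, counts)}
    w = np.array([per_class[c] for c in y], dtype=float)
    return w / w.sum()


def weighted_ridge(A, Y, w, lam=1e-3):
    """Closed-form weighted regularized least squares.

    Minimizes sum_i w_i * ||A_i B - Y_i||^2 + lam * ||B||^2, whose
    solution is B = (A^T W A + lam I)^-1 A^T W Y with W = diag(w).
    A is a hypothetical stand-in for the BLS node-output matrix.
    """
    AtW = A.T * w  # equals A^T @ diag(w) without building the n x n matrix
    return np.linalg.solve(AtW @ A + lam * np.eye(A.shape[1]), AtW @ Y)


def g_mean(y_true, y_pred):
    """Geometric mean of per-class recalls (a common G-mean definition)."""
    classes = np.unique(y_true)
    recalls = [np.mean(y_pred[y_true == c] == c) for c in classes]
    return float(np.prod(recalls) ** (1.0 / len(classes)))
```

In an AdaBoost-style loop, `class_balanced_weights` would supply the first round's weights, `weighted_ridge` would fit each base learner's output weights, and the weights would then be updated from that learner's errors. The class-dependent regularized update described in the abstract is not reproduced here because the abstract does not specify it.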

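Abstract (translated from the Chinese): The Broad Learning System (BLS) is a shallow neural network structure with characteristics such as fast training and incremental learning. When handling class-imbalanced data, it extracts few minority-class features, which leads to unsatisfactory recognition results. This paper proposes an imbalanced data classification method based on an AdaBoost ensemble of Weighted Broad Learning Systems (AdaBoost-WBLS). The weights are updated dynamically through iteration to obtain weights that better fit the characteristics of the data, improving the ensemble model's ability to recognize minority classes. Based on the KKT conditions, the weighted optimization process of the Weighted Broad Learning System is derived, verifying that the diagonal weights suppress the error of the BLS model. When the AdaBoost-WBLS ensemble is initialized, a weight initialization strategy based on category information is adopted, giving the model higher ensemble training efficiency. When the ensemble weights are updated, different regularized update rules are applied to different data categories, retaining the within-class features of the data and increasing the separation between classes. In the experiments, the parameters of the AdaBoost-WBLS model are tuned to obtain their optimal values within a limited range. The experimental results show that, compared with AdaBoost- and BLS-related models, the AdaBoost-WBLS model effectively improves the extraction of minority-class features. On the Satimage dataset, its G-mean is 4.36 percentage points higher than that of the weighted oversampling deep auto-encoder model, a clear improvement in the recognition of imbalanced data.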

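Key words (translated from the Chinese): broad learning system, AdaBoost model, imbalanced data, weighted broad learning system, ensemble learning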

CLC Number: