Gradually Learning Algorithm for Imbalanced Data

doi:10.3969/j.issn.1000-3428.2010.24.058

Computer Engineering, 2010, Vol. 36, Issue (24): 161-163.

• Networks and Communications •

DONG Yuan-fang^1,2a, LI Xiong-fei^1, LI Jun^1,2b

(1. Key Laboratory of Symbolic Computation and Knowledge Engineering of the Ministry of Education, Jilin University, Changchun 130012, China; 2a. School of Economics and Management; 2b. Dept. of Mathematics, Changchun University of Science and Technology, Changchun 130022, China)

Online: 2010-12-20 Published: 2010-12-14

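About the authors: DONG Yuan-fang (b. 1975), female, lecturer and Ph.D. candidate, main research interests: rough set theory and data mining; LI Xiong-fei, professor and doctoral supervisor; LI Jun, associate professor and Ph.D. candidate
Funding:
National Key Technology R&D Program of China (2006BAK01A33); Science and Technology Development Program of Jilin Province (20070321, 20090704)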

Abstract

To address the problem of learning from imbalanced data, a classification algorithm that learns gradually is proposed. The algorithm adds synthetic minority-class examples step by step according to the value-range distribution of the attributes, and promptly removes any synthetic examples that the stage classifier misclassifies. Once the data reaches the desired degree of balance, the learning algorithm is trained on the original data together with the synthetic data to obtain the final classifier. Experimental results show that the gradual learning algorithm outperforms C4.5, and outperforms SMOTEBoost and DataBoost-IM on most data sets.

Key words: classification, imbalanced data, gradually learning
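The loop described in the abstract can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' implementation: the abstract does not give the sampling scheme, the step size, or the base learner, so the sketch assumes uniform sampling within each attribute's observed minority-class range, a fixed `step` of new examples per round, and a simple nearest-centroid classifier as a stand-in for the stage classifier. All function and parameter names (`synthesize`, `gradual_learning`, `target_ratio`, `step`, `max_rounds`) are hypothetical.

```python
import math
import random


def centroid(rows):
    """Mean vector of a list of equal-length numeric rows."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]


class NearestCentroid:
    """Stand-in classifier (assumption): predicts the class with the closest centroid."""

    def fit(self, X, y):
        self.centroids = {
            label: centroid([x for x, t in zip(X, y) if t == label])
            for label in set(y)
        }
        return self

    def predict_one(self, x):
        return min(
            self.centroids,
            key=lambda c: sum((a - b) ** 2 for a, b in zip(x, self.centroids[c])),
        )


def synthesize(minority, k, rng):
    """Draw k synthetic rows, sampling each attribute uniformly within the
    minority class's observed [min, max] range (assumed sampling scheme)."""
    lows = [min(col) for col in zip(*minority)]
    highs = [max(col) for col in zip(*minority)]
    return [[rng.uniform(lo, hi) for lo, hi in zip(lows, highs)] for _ in range(k)]


def gradual_learning(majority, minority, target_ratio=1.0, step=5,
                     max_rounds=50, seed=0):
    """Add synthetic minority rows in small batches, dropping those the stage
    classifier misclassifies, until minority/majority reaches target_ratio."""
    rng = random.Random(seed)
    synthetic = []
    goal = math.ceil(target_ratio * len(majority))
    for _ in range(max_rounds):
        have = len(minority) + len(synthetic)
        if have >= goal:
            break
        synthetic += synthesize(minority, min(step, goal - have), rng)
        X = majority + minority + synthetic
        y = [0] * len(majority) + [1] * (len(minority) + len(synthetic))
        stage = NearestCentroid().fit(X, y)
        # Keep only synthetic rows that the stage classifier labels as minority.
        synthetic = [s for s in synthetic if stage.predict_one(s) == 1]
    # Final classifier: trained on the original data plus surviving synthetic rows.
    X = majority + minority + synthetic
    y = [0] * len(majority) + [1] * (len(minority) + len(synthetic))
    return NearestCentroid().fit(X, y), synthetic
```

Calling `gradual_learning(majority, minority)` with two lists of numeric rows returns the final classifier and the synthetic rows that survived filtering. In practice a decision-tree learner would replace `NearestCentroid` for comparison with C4.5, and `max_rounds` bounds the loop in case the stage classifier keeps rejecting new examples.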

CLC Number: TP181

DONG Yuan-Fang, LI Xiong-Fei, LI Jun. Gradually Learning Algorithm for Imbalanced Data[J]. Computer Engineering, 2010, 36(24): 161-163.

URL: https://www.ecice06.com/EN/Y2010/V36/I24/161
