Computer Engineering, 2020, Vol. 46, Issue (1): 15-24, 30. doi: 10.19678/j.issn.1000-3428.0055747

• Hot Topics and Reviews •

Survey of Ensemble Classification Algorithms for Data Streams with Concept Drift

DU Shiyu, HAN Meng, SHEN Mingyao, ZHANG Chunyan, SUN Rui

  1. School of Computer Science and Engineering, North Minzu University, Yinchuan 750021, China
  • Received: 2019-08-15 Revised: 2019-11-03 Online: 2020-01-15 Published: 2019-11-12
  • About the authors: DU Shiyu (born 1996), female, master's student, main research interest: data mining; HAN Meng (corresponding author), associate professor; SHEN Mingyao, ZHANG Chunyan, and SUN Rui, master's students.
  • Supported by:
    National Natural Science Foundation of China (61563001); Ningxia First-Class Discipline Construction Project for Higher Education Institutions (NXYKXY2017A07); Graduate Innovation Project of North Minzu University (YCX19064).

Abstract: This paper surveys ensemble classification algorithms for data streams with concept drift, covering their basic concepts, related work, scope of application, and advantages and disadvantages. It analyzes in detail the ensemble classification algorithms designed for sudden, gradual, recurring, and incremental drift, as well as the learning strategies used in ensemble classification (Bagging, Boosting, and base classifier combination) and its key techniques (online learning, block-based ensembles, and incremental learning). It then identifies the main problems that current ensemble classification algorithms for data streams with concept drift still need to solve, and discusses directions for further study, including the dynamic updating and weighted combination of ensemble base classifiers and the rapid detection of multiple types of concept drift.

Key words: dynamic data stream, ensemble, classification, concept drift, incremental learning

CLC Number: