Computer Engineering ›› 2011, Vol. 37 ›› Issue (5): 74-76. doi: 10.3969/j.issn.1000-3428.2011.05.025

• Software Technology and Database •

Online Ensemble Classifier for Adaptive Concept Drift

WANG Li-ming, ZHOU Chi

  1. (School of Information Engineering, Zhengzhou University, Zhengzhou 450001, China)
  • Online: 2011-03-05 Published: 2012-10-31
  • About the authors: WANG Li-ming (b. 1963), male, professor, Ph.D., main research interest: distributed data mining; ZHOU Chi, master's student

Abstract: Data stream mining requires algorithms that respond quickly, use little memory, and adapt to concept drift. To meet these requirements, this paper proposes AHBag, an online Bagging classification algorithm based on Hoeffding trees that adapts to concept drift. Using statistical theory, the algorithm tests whether the classification accuracy of the model on the data within an adaptive window falls inside the one-sided confidence interval of the true error rate; depending on the test result, it either updates the existing Hoeffding tree or rebuilds a new one. Experimental results show that the algorithm achieves high classification accuracy on data streams with concept drift.

Key words: data stream, concept drift, Hoeffding tree, online Bagging
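
The abstract describes the mechanism only in outline, so the following is a minimal sketch of the general idea rather than the authors' AHBag implementation. It rests on several assumptions that are not in the source: a simple lookup-table learner (`TableLearner`) stands in for a Hoeffding tree, the window has a fixed size instead of being adaptive, the one-sided bound uses a normal approximation with z = 1.645 and add-one smoothing, and examples are weighted by Poisson(1) draws as in the standard online Bagging scheme. All class and function names are hypothetical.

```python
import math
import random
from collections import Counter, deque


def poisson1(rng):
    """Draw k ~ Poisson(1) using Knuth's multiplication method."""
    limit, k, p = math.exp(-1.0), 0, rng.random()
    while p > limit:
        k += 1
        p *= rng.random()
    return k


def error_upper_bound(errors, n, z=1.645):
    """One-sided upper bound on the true error rate (normal approximation).

    Add-one smoothing (an assumption) keeps the bound above zero when no
    errors were observed.
    """
    rate = (errors + 1) / (n + 2)
    return rate + z * math.sqrt(rate * (1.0 - rate) / (n + 2))


class TableLearner:
    """Hypothetical stand-in base learner; NOT a Hoeffding tree."""

    def __init__(self):
        self.by_input = {}        # feature tuple -> Counter of labels
        self.overall = Counter()  # fallback label counts for unseen inputs

    def learn(self, x, y, weight):
        self.by_input.setdefault(x, Counter())[y] += weight
        self.overall[y] += weight

    def predict(self, x):
        counts = self.by_input.get(x) or self.overall
        return counts.most_common(1)[0][0] if counts else None


class Member:
    """One ensemble member with its own error window and reference stats."""

    def __init__(self, window_size):
        self.window = deque(maxlen=window_size)  # recent 0/1 error flags
        self.rebuilds = 0
        self.reset()

    def reset(self):
        self.model = TableLearner()
        self.ref_errors = 0  # errors that have already left the window
        self.ref_n = 0
        self.window.clear()

    def drift_detected(self):
        size = self.window.maxlen
        if len(self.window) < size or self.ref_n < size:
            return False  # not enough evidence yet
        window_rate = sum(self.window) / size
        return window_rate > error_upper_bound(self.ref_errors, self.ref_n)

    def step(self, x, y, rng):
        err = int(self.model.predict(x) != y)  # test first, then train
        if len(self.window) == self.window.maxlen:
            self.ref_errors += self.window[0]  # oldest flag moves to reference
            self.ref_n += 1
        self.window.append(err)
        if self.drift_detected():
            self.reset()  # rebuild the model from scratch
            self.rebuilds += 1
        k = poisson1(rng)  # online Bagging: present the example k times
        if k > 0:
            self.model.learn(x, y, k)


class OnlineBaggingWithDriftCheck:
    def __init__(self, n_members=5, window_size=30, seed=0):
        self.rng = random.Random(seed)
        self.members = [Member(window_size) for _ in range(n_members)]

    def learn(self, x, y):
        for member in self.members:
            member.step(x, y, self.rng)

    def predict(self, x):
        votes = Counter(m.model.predict(x) for m in self.members)
        votes.pop(None, None)  # ignore members that cannot predict yet
        return votes.most_common(1)[0][0] if votes else None

    @property
    def rebuilds(self):
        return sum(m.rebuilds for m in self.members)


def run_demo():
    """Synthetic stream whose labels flip abruptly at example 500."""
    ens = OnlineBaggingWithDriftCheck()
    for i in range(1000):
        x = (i % 2,)
        y = x[0] if i < 500 else 1 - x[0]
        ens.learn(x, y)
    return ens
```

Calling `run_demo()` feeds the ensemble a synthetic stream with an abrupt label flip halfway through. Each member keeps updating its model while the windowed error rate stays under the bound and rebuilds it once the rate exceeds the bound, so the ensemble should end up predicting the post-drift labels. A real implementation would replace `TableLearner` with an actual Hoeffding tree and tune the window and confidence level for noisy data.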

CLC Number: