
Computer Engineering ›› 2012, Vol. 38 ›› Issue (5): 62-63,69. doi: 10.3969/j.issn.1000-3428.2012.05.017

• Software Technology and Database •

Study of Data Stream Classification Algorithm with On-demand Ensemble

QIAN Lin, QIN Liang-xi

  1. (School of Computer, Electronics and Information, Guangxi University, Nanning 530004, China)
  • Received:2011-08-22 Online:2012-03-05 Published:2012-03-05
  • About the authors: QIAN Lin (born 1984), male, M.S., main research interests: data mining and knowledge discovery; QIN Liang-xi, professor, Ph.D.
  • Supported by:
    National Key Technology R&D Program of China during the 11th Five-Year Plan (2009BAH53B 03)

Abstract: Traditional classifier-ensemble algorithms for data stream classification have high memory consumption and computational cost. To address this problem, an on-demand ensemble classification algorithm is proposed. Based on the characteristics of the data stream, it dynamically adjusts the number of classifiers and their weights on demand, thereby reducing cost while maintaining high classification accuracy. Experiments on two synthetic data streams show that the algorithm improves both classification efficiency and accuracy on data streams with hidden concept drift, while reducing memory consumption.

Key words: data stream, on-demand ensemble, concept drifting, classifier ensemble
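
The abstract does not spell out the update rules, so the following is only a minimal sketch of what adjusting the number of classifiers and their weights "on demand" could look like, not the authors' algorithm. Everything in it is an assumption made for illustration: the stream is processed in labelled chunks, each member's weight is its accuracy on the latest chunk, and the hypothetical parameters `drop_below`, `grow_below` and `max_size` decide when members are discarded or added. The nearest-centroid base learner is likewise a stand-in chosen to keep the example dependency-free.

```python
from collections import Counter, defaultdict


class CentroidClassifier:
    """Toy base learner (assumption): predicts the class with the nearest mean."""

    def __init__(self, X, y):
        sums, counts = {}, Counter(y)
        for xi, yi in zip(X, y):
            acc = sums.setdefault(yi, [0.0] * len(xi))
            for j, v in enumerate(xi):
                acc[j] += v
        self.centroids = {c: [s / counts[c] for s in sums[c]] for c in sums}

    def predict(self, x):
        def dist2(c):
            return sum((a - b) ** 2 for a, b in zip(x, self.centroids[c]))
        return min(self.centroids, key=dist2)


def accuracy(model, X, y):
    """Fraction of the chunk that `model` labels correctly."""
    return sum(model.predict(xi) == yi for xi, yi in zip(X, y)) / len(y)


class OnDemandEnsemble:
    """Hypothetical ensemble that only grows when the current members fall short."""

    def __init__(self, max_size=5, drop_below=0.5, grow_below=0.9):
        self.max_size = max_size      # upper bound on members (memory cap)
        self.drop_below = drop_below  # discard members weaker than this
        self.grow_below = grow_below  # train a new member only below this
        self.members = []             # list of [model, weight] pairs

    def predict(self, x):
        # Weighted vote over the current members.
        votes = defaultdict(float)
        for model, weight in self.members:
            votes[model.predict(x)] += weight
        return max(votes, key=votes.get) if votes else None

    def update(self, X, y):
        # 1) Re-weight every member by its accuracy on the newest chunk.
        for member in self.members:
            member[1] = accuracy(member[0], X, y)
        # 2) Drop members that no longer fit the current concept.
        self.members = [m for m in self.members if m[1] >= self.drop_below]
        # 3) Add a member only on demand, i.e. when the ensemble is empty
        #    or its accuracy on the chunk is below the growth threshold.
        if not self.members or accuracy(self, X, y) < self.grow_below:
            model = CentroidClassifier(X, y)
            self.members.append([model, accuracy(model, X, y)])
            # Keep the strongest members if the cap is exceeded.
            self.members.sort(key=lambda m: m[1], reverse=True)
            del self.members[self.max_size:]
```

Under these assumptions, feeding the same concept twice leaves the ensemble at a single member, because no new classifier is trained while the existing one is still accurate. A chunk with swapped labels (an abrupt concept drift) causes the stale member to be dropped and a replacement to be trained.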
