Abstract: This paper examines the characteristics of two typical approaches to data stream classification, active mining and passive mining. It analyzes the two basic types of change in real-world data streams and the combined changes derived from them, demonstrates that the active mining method fails to work effectively in many situations, and proposes an approach for detecting changes in data streams effectively. By using active learning, limited resources can be used to assemble high-quality labeled data, which reduces the amount of training data required.
Key words: data stream; concept drift; distribution change; active mining; passive mining
CLC number:
黄树成, 朱霞. 数据流分类变化的分析和检测[J]. 计算机工程, 2011, 37(4): 78-80.
HUANG Shu-Cheng, ZHU Xia. Analysis and Detection of Changes in Data Stream Classification[J]. Computer Engineering, 2011, 37(4): 78-80.