Computer Engineering ›› 2020, Vol. 46 ›› Issue (5): 139-143, 149. doi: 10.19678/j.issn.1000-3428.0054753

• Advanced Computing and Data Processing •

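Research on Multi-Source Transfer Learning Data Streams Classification Based on Dynamic Strategy

ZHOU Sheng, LIU Sanmin

  1. College of Computer and Information, Anhui Polytechnic University, Wuhu, Anhui 241000, China
  • Received: 2019-04-28 Revised: 2019-05-28 Published: 2019-06-14
  • About the authors: ZHOU Sheng (born 1993), male, master's student, whose main research interests are machine learning and transfer learning; LIU Sanmin, associate professor, Ph.D.
  • Supported by:
    Natural Science Foundation of Anhui Province (1608085MF147).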

Abstract: To address concept drift and noise in data stream classification, this paper proposes a multi-source transfer learning method based on sample certainty. The method first stores the classifiers trained on multiple source domains. For each sample in a target-domain data block, it then computes the class posterior probability given by each source-domain classifier, along with the corresponding sample certainty. On this basis, the source-domain classifiers whose sample certainty satisfies the current threshold are integrated online with the target-domain classifiers, so that knowledge from multiple source domains is transferred to the target domain. Experimental results show that the proposed method can effectively eliminate the adverse effects of noisy data streams on uncertain classifiers, and that it achieves higher classification accuracy and better anti-noise stability than multi-source transfer learning methods that select ensemble members by accuracy.
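
The selection step described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract gives neither the formula for sample certainty nor the rule that updates the threshold, so the sketch assumes that certainty is the largest class posterior, that selection happens per sample, that a single target-domain classifier is used, and that the selected posteriors are combined by plain averaging. All names (`sample_certainty`, `ensemble_predict`, the toy classifiers) are hypothetical.

```python
from typing import Callable, List, Sequence, Tuple

Posterior = List[float]
# A classifier is modeled as a function from a sample to class posteriors.
Classifier = Callable[[Sequence[float]], Posterior]


def sample_certainty(posterior: Posterior) -> float:
    # ASSUMPTION: certainty is the largest class posterior probability.
    # The abstract does not state the actual definition.
    return max(posterior)


def ensemble_predict(
    x: Sequence[float],
    source_clfs: List[Classifier],
    target_clf: Classifier,
    threshold: float,
) -> Tuple[int, Posterior]:
    # The target-domain classifier always takes part in the ensemble.
    selected = [target_clf(x)]
    for clf in source_clfs:
        posterior = clf(x)
        # Keep a source classifier only if it is certain enough on x.
        if sample_certainty(posterior) >= threshold:
            selected.append(posterior)
    n_classes = len(selected[0])
    # ASSUMPTION: unweighted averaging of the selected posteriors.
    avg = [sum(p[k] for p in selected) / len(selected) for k in range(n_classes)]
    label = max(range(n_classes), key=lambda k: avg[k])
    return label, avg


# Toy two-class classifiers that ignore their input (illustration only).
def confident_source(x: Sequence[float]) -> Posterior:
    return [0.9, 0.1]


def unsure_source(x: Sequence[float]) -> Posterior:
    return [0.5, 0.5]


def toy_target(x: Sequence[float]) -> Posterior:
    return [0.4, 0.6]


label, avg = ensemble_predict(
    [0.0], [confident_source, unsure_source], toy_target, threshold=0.8
)
print(label, avg)
```

With a threshold of 0.8, only `confident_source` joins the target classifier, and the uncertain source is left out of the vote. The threshold is passed in as a plain parameter here because its update rule, the "dynamic strategy" of the title, is not given in the abstract.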

Key words: data streams classification, multi-source transfer learning, category posterior probability, sample certainty, ensemble learning

CLC number: