Research on Multi-Source Transfer Learning Data Streams Classification Based on Dynamic Strategy

doi:10.19678/j.issn.1000-3428.0054753

Abstract

To address concept drift and noise in data stream classification, this paper proposes a multi-source transfer learning method based on sample certainty. The method stores the classifiers trained on multiple source domains, then computes, for each sample in a target-domain data block, the category posterior probability and sample certainty value given by each source-domain classifier. Source-domain classifiers whose sample certainty satisfies the current threshold are then integrated online with the target-domain classifier, transferring knowledge from multiple source domains to the target domain. Experimental results show that the method effectively eliminates the adverse effect of noisy data streams on uncertain classifiers and, compared with multi-source transfer learning methods that select ensemble members by accuracy, achieves higher classification accuracy and better stability under noise.

Key words: data stream classification, multi-source transfer learning, category posterior probability, sample certainty, ensemble learning
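The selection-and-ensemble step described in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact formulas, so two choices here are assumptions made only for illustration: sample certainty is taken to be the largest class posterior, and the selected classifiers are combined by an unweighted mean of their posteriors. The function names (`certainty`, `ensemble_predict`, `classify_block`) are hypothetical, and the threshold is passed in as a fixed value rather than updated dynamically as the paper's strategy may do.

```python
from typing import List, Sequence


def certainty(posterior: Sequence[float]) -> float:
    """Assumed certainty measure: the largest class posterior."""
    return max(posterior)


def ensemble_predict(
    target_posterior: Sequence[float],
    source_posteriors: List[Sequence[float]],
    threshold: float,
) -> int:
    """Predict one sample's class from the target classifier plus the
    source classifiers whose certainty meets the threshold."""
    # The target-domain classifier always takes part.
    members = [target_posterior]
    # Keep only source classifiers that are certain enough about this sample.
    members += [p for p in source_posteriors if certainty(p) >= threshold]
    n_classes = len(target_posterior)
    # Assumed combination rule: unweighted mean of the member posteriors.
    avg = [sum(p[c] for p in members) / len(members) for c in range(n_classes)]
    return max(range(n_classes), key=lambda c: avg[c])


def classify_block(
    target_block: List[Sequence[float]],
    source_blocks: List[List[Sequence[float]]],
    threshold: float,
) -> List[int]:
    """Classify every sample in a target-domain data block.

    target_block[i] is the target classifier's posterior for sample i;
    source_blocks[k][i] is source classifier k's posterior for sample i.
    """
    return [
        ensemble_predict(target_block[i], [sb[i] for sb in source_blocks], threshold)
        for i in range(len(target_block))
    ]
```

With a threshold of 0.8, a source classifier that outputs posteriors (0.1, 0.9) for a sample joins the ensemble, while one that outputs (0.6, 0.4) is left out for that sample, so an uncertain source classifier cannot sway the prediction.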

CLC number: TP181

ZHOU Sheng, LIU Sanmin. Research on Multi-Source Transfer Learning Data Streams Classification Based on Dynamic Strategy[J]. Computer Engineering, 2020, 46(5): 139-143, 149. (in Chinese)

https://www.ecice06.com/CN/Y2020/V46/I5/139

Figures/Tables: 7

