
Computer Engineering ›› 2023, Vol. 49 ›› Issue (2): 112-118. doi: 10.19678/j.issn.1000-3428.0063536

• Artificial Intelligence and Pattern Recognition •

Continual Learning Method for Sentiment Classification Based on Knowledge Architecture

WANG Song1, Mairidan Wushouer1, Gulanbaier Tuerhong1, XUE Yuan1,2   

  1. School of Information Science and Engineering, Xinjiang University, Urumqi 830046, China;
  2. Department of Electronic Engineering, Tsinghua University, Beijing 100084, China
  • Received: 2021-12-15 Revised: 2022-03-03 Published: 2022-06-30

  • About the authors: WANG Song (1995-), male, M.S. candidate; his research interests include continual learning and sentiment classification. Mairidan Wushouer (corresponding author) and Gulanbaier Tuerhong, associate professors, Ph.D.; XUE Yuan, M.S. candidate.
  • Funding:
    Natural Science Foundation of Xinjiang Uygur Autonomous Region (2021D01C118); Scientific Research Program of the Higher Education Institutions of Xinjiang Uygur Autonomous Region (XJEDU2018Y005).

Abstract: When a sentiment classification model learns sentiment classification tasks from multiple domains in sequence, the parameters learned on each new task directly overwrite the model's existing parameters. Because no mechanism protects the original parameters, the model's classification accuracy on old tasks degrades. To alleviate this catastrophic forgetting and to increase knowledge transfer between tasks, this study proposes a Continual Learning (CL) method for sentiment classification based on a knowledge architecture. In the Transformer encoding layer, a task self-attention mechanism assigns each task its own attention transformation matrices, and knowledge is retained by separating these task-specific attention parameters. In the fully connected layer of the Convolutional Neural Network for Sentence Classification (TextCNN), the Hard Attention on Task (HAT) mechanism controls the opening and closing of each neuron, training a task-specific subnetwork that activates only the neurons important to that task; this realizes knowledge mining and improves classification efficiency and accuracy. Experimental results on the JD21 Chinese dataset show that the Last Accuracy (Last ACC) and the F1-score on negative classes (F1-NEG) of the proposed method are 0.37 and 0.09 percentage points higher, respectively, than those of the HAT-based CL method, indicating higher classification accuracy and greater effectiveness in mitigating catastrophic forgetting.
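The abstract outlines two mechanisms: per-task attention transformation matrices in the Transformer encoding layer (knowledge retention) and HAT-style gating of fully connected units in TextCNN (knowledge mining). The following PyTorch sketch illustrates both ideas under stated assumptions; it is not the authors' implementation, and all names (TaskSelfAttention, HATGatedLinear, n_tasks, s_max) are illustrative.

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class TaskSelfAttention(nn.Module):
        # Knowledge retention: a separate Q/K/V transformation matrix set per task,
        # so attention parameters learned for one task are not overwritten by later tasks.
        def __init__(self, n_tasks, d_model):
            super().__init__()
            self.q = nn.ModuleList(nn.Linear(d_model, d_model) for _ in range(n_tasks))
            self.k = nn.ModuleList(nn.Linear(d_model, d_model) for _ in range(n_tasks))
            self.v = nn.ModuleList(nn.Linear(d_model, d_model) for _ in range(n_tasks))
            self.scale = d_model ** -0.5

        def forward(self, x, task):
            # x: (batch, seq_len, d_model); task: integer task id
            q, k, v = self.q[task](x), self.k[task](x), self.v[task](x)
            attn = F.softmax(q @ k.transpose(-2, -1) * self.scale, dim=-1)
            return attn @ v

    class HATGatedLinear(nn.Module):
        # Knowledge mining: a Hard-Attention-on-Task gate opens or closes each unit
        # of the fully connected layer, activating only task-relevant neurons.
        def __init__(self, n_tasks, d_in, d_out, s_max=400.0):
            super().__init__()
            self.fc = nn.Linear(d_in, d_out)
            self.gate = nn.Embedding(n_tasks, d_out)  # one gate vector per task
            self.s_max = s_max

        def forward(self, x, task, s):
            # x: (batch, d_in); s: gate scale, annealed toward s_max during training
            t = torch.as_tensor(task, device=x.device)
            mask = torch.sigmoid(s * self.gate(t))    # near-binary mask when s = s_max
            return F.relu(self.fc(x)) * mask          # keep only task-relevant units

In the original HAT formulation, the scale s is annealed toward s_max within each epoch so the sigmoid mask approaches a hard 0/1 gate, and gradients to weights marked important by previous tasks' masks are attenuated; a comparable scheme would apply to the sketch above.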

Key words: Continual Learning (CL), knowledge architecture, sentiment classification, Knowledge Retention Network (KRN), Knowledge Mining Network (KMN)

