Computer Engineering ›› 2022, Vol. 48 ›› Issue (5): 59-66, 73. doi: 10.19678/j.issn.1000-3428.0061176

• Artificial Intelligence and Pattern Recognition •

Chinese Text Classification Method Based on Improved BiGRU-CNN

CHEN Kejia, LIU Hui

  1. School of Economics and Management, Fuzhou University, Fuzhou 350116, China
  • Received: 2021-03-17  Revised: 2021-05-21  Published: 2022-05-10
  • About the authors: CHEN Kejia (b. 1978), male, professor, Ph.D.; his main research interest is data mining. LIU Hui, master's degree candidate.
  • Supported by:
    National Natural Science Foundation of China (71701019).

Abstract: The conventional self-attention mechanism can highlight the key features of a text while retaining its original features, thereby obtaining a more accurate representation of the text feature vector. However, it ignores that the text vectors at different positions of the input sequence contribute differently to the output, so the weight allocation deviates from the actual situation. The Bidirectional Gated Recurrent Unit (BiGRU) network can capture global information but disregards the local dependencies within a text. To address these problems, a text classification model named SAttBiGRU-MCNN, which combines a BiGRU and a multi-channel Convolutional Neural Network (CNN) with an improved self-attention mechanism, is proposed. The BiGRU captures the global information of the text sequence to obtain its contextual semantics, and an optimized multi-channel CNN extracts local features, compensating for the BiGRU's neglect of them. On this basis, the conventional self-attention mechanism is improved by introducing a position weight parameter: the computed self-attention weight probabilities are redistributed according to the positions of the trained text vectors, and softmax is then applied to obtain the classification results for the sample labels. Experimental results on two standard datasets show that the accuracy of the model reaches 98.95% and 88.1%, respectively, which is up to 8.99 and 7.31 percentage points higher than that of classification models such as FastText, CNN, and RCNN. The model also performs well in terms of precision, recall, and F1 value, achieving better text classification results.

Key words: self-attention mechanism, Bidirectional Gated Recurrent Unit (BiGRU), multi-channel Convolutional Neural Network (CNN), text classification, deep learning

CLC Number:
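
The abstract describes the SAttBiGRU-MCNN architecture only at a high level. The following is a minimal PyTorch sketch of that pipeline for illustration; the layer sizes, kernel widths, the scaled dot-product form of the attention, and the learnable per-position scalar used here as the "position weight parameter" are all assumptions, not the authors' published implementation.

import torch
import torch.nn as nn
import torch.nn.functional as F

class SAttBiGRUMCNN(nn.Module):
    """Illustrative sketch of the SAttBiGRU-MCNN pipeline described in the abstract."""

    def __init__(self, vocab_size, num_classes, emb_dim=128, hidden=64,
                 kernel_sizes=(2, 3, 4), num_filters=64, max_len=100):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, emb_dim)
        # BiGRU branch: captures global, bidirectional context of the sequence.
        self.bigru = nn.GRU(emb_dim, hidden, batch_first=True, bidirectional=True)
        # Multi-channel CNN branch: parallel 1-D convolutions with different
        # kernel widths extract local n-gram features that the BiGRU may miss.
        self.convs = nn.ModuleList(
            nn.Conv1d(emb_dim, num_filters, k, padding=k // 2) for k in kernel_sizes
        )
        # Learnable per-position scalars: one assumed realisation of the
        # "position weight parameter" used to redistribute attention weights.
        self.pos_weight = nn.Parameter(torch.ones(max_len))
        self.fc = nn.Linear(2 * hidden + num_filters * len(kernel_sizes), num_classes)

    def forward(self, x):                       # x: (batch, seq_len) token ids
        e = self.embedding(x)                   # (batch, seq_len, emb_dim)
        h, _ = self.bigru(e)                    # (batch, seq_len, 2 * hidden)

        # Self-attention over the BiGRU states; the softmax probabilities are
        # re-weighted by position and then renormalised.
        scores = torch.matmul(h, h.transpose(1, 2)) / (h.size(-1) ** 0.5)
        alpha = F.softmax(scores, dim=-1)                   # (batch, L, L)
        alpha = alpha * self.pos_weight[: x.size(1)]        # position re-weighting
        alpha = alpha / alpha.sum(dim=-1, keepdim=True)
        global_feat = torch.matmul(alpha, h).mean(dim=1)    # (batch, 2 * hidden)

        # CNN branch over the embeddings, max-pooled per channel.
        c = e.transpose(1, 2)                               # (batch, emb_dim, L)
        local_feat = torch.cat(
            [conv(c).max(dim=-1).values for conv in self.convs], dim=1
        )

        logits = self.fc(torch.cat([global_feat, local_feat], dim=1))
        return F.softmax(logits, dim=-1)        # class probabilities for the labels

A forward pass on a (batch, seq_len) tensor of padded token ids (seq_len <= max_len) returns a (batch, num_classes) probability matrix; for training one would normally keep the pre-softmax logits and apply a cross-entropy loss instead.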