Semi-supervised Classification for Short Text Based on Multi-grained Graphs and Attention Mechanism

doi:10.19678/j.issn.1000-3428.0066475

Abstract

Abstract: Short texts are semantically sparse and ambiguous, carry little information, and are often irregularly expressed, all of which make short text classification challenging. Existing short text classification methods also tend to ignore the interactions between terms and therefore cannot fully mine the implicit semantics, resulting in inefficient classification. To address these problems, a semi-supervised short text classification model based on multi-grained graphs and an attention mechanism, named MgGAt, is proposed. The model constructs two types of graphs, one at word granularity and one at text granularity, and performs classification by fully mining semantic information. First, a word-level graph is built to capture word embeddings, from which the feature representation of each short text is learned. Intra-hop and inter-hop attention are introduced on the word-level graph to extract the high-order information hidden among terms from multiple semantic perspectives, yielding semantically rich word embeddings. A pooling strategy designed around the characteristics of the word-level subgraphs then aggregates the word embeddings into text representations. Second, a text-level graph is constructed; with the help of the partially known label information, a Graph Neural Network (GNN) performs label propagation and inference on this graph to complete semi-supervised short text classification. Experimental results on four public datasets show that, compared with baseline models, MgGAt improves classification accuracy by 1.18 percentage points and the F1 value by 1.37 percentage points on average, giving better classification performance.

Key words: short text classification, semi-supervised classification, Graph Neural Network (GNN), attention mechanism, multi-grained graph
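The abstract names label propagation on a text-level graph as the semi-supervised step. As a rough illustration of that idea only, the sketch below runs classic iterative label propagation on a hand-built toy graph; it is not the graph neural network that MgGAt uses. The function name `propagate_labels`, the damping factor `alpha`, the iteration count, and the six-node adjacency matrix are all assumptions made for this example and do not come from the paper.

```python
def propagate_labels(adj, labels, num_classes, alpha=0.8, iters=50):
    """Classic label propagation (illustrative stand-in, not MgGAt itself).

    adj:     symmetric weighted adjacency matrix of the text-level graph
    labels:  class index for labeled texts, None for unlabeled texts
    alpha:   hypothetical damping factor balancing neighbors vs. seed labels
    """
    n = len(adj)
    # Symmetric normalization: S = D^{-1/2} A D^{-1/2}
    deg = [sum(row) for row in adj]
    inv_sqrt = [d ** -0.5 if d > 0 else 0.0 for d in deg]
    S = [[inv_sqrt[i] * adj[i][j] * inv_sqrt[j] for j in range(n)]
         for i in range(n)]
    # One-hot seed matrix Y; rows of unlabeled texts stay all zero
    Y = [[1.0 if labels[i] == c else 0.0 for c in range(num_classes)]
         for i in range(n)]
    F = [row[:] for row in Y]
    for _ in range(iters):
        # F <- alpha * S F + (1 - alpha) * Y
        F = [[alpha * sum(S[i][k] * F[k][c] for k in range(n))
              + (1 - alpha) * Y[i][c]
              for c in range(num_classes)]
             for i in range(n)]
    # Each text takes the class with the highest propagated score
    return [max(range(num_classes), key=lambda c: F[i][c]) for i in range(n)]


# Toy text-level graph (assumed): two tight groups of three texts,
# joined by one weak edge between text 2 and text 3.
adjacency = [
    [0, 1, 1, 0,   0, 0],
    [1, 0, 1, 0,   0, 0],
    [1, 1, 0, 0.1, 0, 0],
    [0, 0, 0.1, 0, 1, 1],
    [0, 0, 0, 1,   0, 1],
    [0, 0, 0, 1,   1, 0],
]
# Only one text per group carries a known label.
known = [0, None, None, None, None, 1]

predictions = propagate_labels(adjacency, known, num_classes=2)
print(predictions)  # [0, 0, 0, 1, 1, 1]
```

With only two labeled texts, the four unlabeled texts inherit the label of the group they are densely connected to, which is the intuition behind using a small labeled set on a text-level graph.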

Chinese Library Classification (CLC) number: TP18

YOU Ben, LI Xiaohong, YAO Jin, FENG Shaojie. Semi-supervised Classification for Short Text Based on Multi-grained Graphs and Attention Mechanism[J]. Computer Engineering, 2024, 50(5): 83-90.

https://www.ecice06.com/CN/Y2024/V50/I5/83

