
Computer Engineering, 2021, Vol. 47, Issue (12): 87-94. doi: 10.19678/j.issn.1000-3428.0059920

• Artificial Intelligence and Pattern Recognition •

Method for Few-Shot Short Text Classification Based on Heterogeneous Graph Convolutional Network

YUAN Ziyong1, GAO Shu1, CAO Jiao2, CHEN Liangchen1,3   

1. College of Computer Science and Technology, Wuhan University of Technology, Wuhan 430063, China;
    2. Library Network Information Center, Yiyang Medical College, Yiyang, Hunan 413046, China;
    3. Applied Technology College, China University of Labor Relations, Beijing 100048, China
  • Received: 2020-11-05  Revised: 2020-12-31  Published: 2020-12-09

  • About the authors: YUAN Ziyong (born 1995), male, master's student; his main research interest is natural language processing. GAO Shu, professor, Ph.D. CAO Jiao, lecturer, master's degree. CHEN Liangchen (corresponding author), associate professor, Ph.D. candidate.
  • Supported by:
    National Natural Science Foundation of China (51679180); Fundamental Research Funds for the Central Universities of China University of Labor Relations (21ZYJS017).

Abstract: To address the semantic sparsity and overfitting that arise in few-shot classification of short texts, this paper proposes a few-shot short text classification model, HGCN-RN, in which a heterogeneous graph convolutional network with a dual-level attention mechanism learns both the importance of different neighboring nodes and the importance of different node types to the current node. The BTM topic model is first used to extract topic information from the short text datasets, and a heterogeneous information network that integrates entities and topic information is then constructed for the short texts to alleviate semantic sparsity. On this basis, a heterogeneous graph convolutional network based on the dual-level attention mechanism and a random neighbor reduction method is built to extract semantic information from the heterogeneous information network, and the random neighbor reduction method is also used for data augmentation to mitigate overfitting. Experimental results on three short text datasets show that, compared with benchmark models such as LSTM, Text GCN, and HGAT, the proposed model still achieves state-of-the-art performance when only ten labeled samples are available per class.
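The following is a minimal sketch, not the authors' released code, of how one heterogeneous graph convolution layer combining the dual-level attention mechanism (node-level attention over neighbors, type-level attention over node types) with random neighbor reduction described in the abstract might look in PyTorch. The class name, tensor layouts, and the 0.2 neighbor-drop rate are illustrative assumptions.

```python
# Sketch of a heterogeneous GCN layer with dual-level attention and random
# neighbor reduction, under the assumptions stated above.
import torch
import torch.nn as nn
import torch.nn.functional as F

class DualAttentionHeteroGCNLayer(nn.Module):
    def __init__(self, in_dims, out_dim, drop_neighbor_rate=0.2):
        super().__init__()
        # One projection per node type (e.g. word, entity, topic) so that
        # differently sized type-specific features map to a shared out_dim.
        self.proj = nn.ModuleDict({t: nn.Linear(d, out_dim) for t, d in in_dims.items()})
        self.node_att = nn.Linear(2 * out_dim, 1)   # node-level attention
        self.type_att = nn.Linear(2 * out_dim, 1)   # type-level attention
        self.drop_rate = drop_neighbor_rate

    def forward(self, h_self, neighbors):
        # h_self: (out_dim,) embedding of the current node.
        # neighbors: dict {node_type: tensor of shape (num_neighbors, in_dim)}.
        type_messages, type_scores = [], []
        for ntype, feats in neighbors.items():
            feats = self.proj[ntype](feats)                        # (n, out_dim)
            if self.training and self.drop_rate > 0:
                # Random neighbor reduction: drop a fraction of neighbors
                # during training as a form of data augmentation.
                keep = torch.rand(feats.size(0)) > self.drop_rate
                feats = feats[keep] if keep.any() else feats
            # Node-level attention over the remaining neighbors of this type.
            pair = torch.cat([h_self.expand_as(feats), feats], dim=-1)
            alpha = F.softmax(self.node_att(pair).squeeze(-1), dim=0)
            msg = (alpha.unsqueeze(-1) * feats).sum(dim=0)         # (out_dim,)
            type_messages.append(msg)
            type_scores.append(self.type_att(torch.cat([h_self, msg])))
        # Type-level attention: weigh how much each node type contributes.
        beta = F.softmax(torch.stack(type_scores).squeeze(-1), dim=0)
        agg = (beta.unsqueeze(-1) * torch.stack(type_messages)).sum(dim=0)
        return F.relu(agg)

# Illustrative usage: a short-text node aggregating topic- and entity-type neighbors.
layer = DualAttentionHeteroGCNLayer({"topic": 50, "entity": 100}, out_dim=64)
h = layer(torch.zeros(64), {"topic": torch.randn(5, 50), "entity": torch.randn(3, 100)})
```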

Key words: few-shot short text classification, heterogeneous graph convolutional network, heterogeneous information network for short text, BTM topic model, overfitting

