
Computer Engineering, 2021, Vol. 47, Issue (12): 87-94. doi: 10.19678/j.issn.1000-3428.0059920

• Artificial Intelligence and Pattern Recognition •

Method for Few-Shot Short Text Classification Based on Heterogeneous Graph Convolutional Network

YUAN Ziyong1, GAO Shu1, CAO Jiao2, CHEN Liangchen1,3   

1. College of Computer Science and Technology, Wuhan University of Technology, Wuhan 430063, China;
    2. Library Network Information Center, Yiyang Medical College, Yiyang, Hunan 413046, China;
    3. Applied Technology College, China University of Labor Relations, Beijing 100048, China
  • Received: 2020-11-05  Revised: 2020-12-31  Published: 2020-12-09

  • About the authors: YUAN Ziyong (born 1995), male, master's student; his main research interest is natural language processing. GAO Shu, professor, Ph.D. CAO Jiao, lecturer, master's degree. CHEN Liangchen (corresponding author), associate professor, Ph.D. candidate.
  • Supported by:
    National Natural Science Foundation of China (51679180); Fundamental Research Funds for the Central Universities of China University of Labor Relations (21ZYJS017).

Abstract: To address the semantic sparsity and overfitting that arise in few-shot classification of short texts, this paper proposes a few-shot short text classification model, HGCN-RN, in which a heterogeneous graph convolutional network with a dual-level attention mechanism learns both the importance of different neighboring nodes and the importance of different node types to the current node. The BTM topic model is first used to extract topic information from the short text datasets, and a heterogeneous information network that integrates entities and topic information is then constructed for the short texts to alleviate semantic sparsity. On this basis, a heterogeneous graph convolutional network based on the dual-level attention mechanism and a random neighbor reduction method is built to extract semantic information from the heterogeneous information network, and the random neighbor reduction method is also used for data augmentation to mitigate overfitting. Experimental results on three short text datasets show that, compared with benchmark models such as LSTM, Text GCN, and HGAT, the proposed model still achieves state-of-the-art performance when only ten labeled samples are available per class.
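The following is a minimal sketch, not the authors' released code, of how one heterogeneous graph convolution layer combining the dual-level attention mechanism (node-level attention over neighbors, type-level attention over node types) with random neighbor reduction described in the abstract might look in PyTorch. The class name, tensor layouts, and the 0.2 neighbor-drop rate are illustrative assumptions.

```python
# Sketch of a heterogeneous GCN layer with dual-level attention and random
# neighbor reduction, under the assumptions stated above.
import torch
import torch.nn as nn
import torch.nn.functional as F

class DualAttentionHeteroGCNLayer(nn.Module):
    def __init__(self, in_dims, out_dim, drop_neighbor_rate=0.2):
        super().__init__()
        # One projection per node type (e.g. word, entity, topic) so that
        # differently sized type-specific features map to a shared out_dim.
        self.proj = nn.ModuleDict({t: nn.Linear(d, out_dim) for t, d in in_dims.items()})
        self.node_att = nn.Linear(2 * out_dim, 1)   # node-level attention
        self.type_att = nn.Linear(2 * out_dim, 1)   # type-level attention
        self.drop_rate = drop_neighbor_rate

    def forward(self, h_self, neighbors):
        # h_self: (out_dim,) embedding of the current node.
        # neighbors: dict {node_type: tensor of shape (num_neighbors, in_dim)}.
        type_messages, type_scores = [], []
        for ntype, feats in neighbors.items():
            feats = self.proj[ntype](feats)                        # (n, out_dim)
            if self.training and self.drop_rate > 0:
                # Random neighbor reduction: drop a fraction of neighbors
                # during training as a form of data augmentation.
                keep = torch.rand(feats.size(0)) > self.drop_rate
                feats = feats[keep] if keep.any() else feats
            # Node-level attention over the remaining neighbors of this type.
            pair = torch.cat([h_self.expand_as(feats), feats], dim=-1)
            alpha = F.softmax(self.node_att(pair).squeeze(-1), dim=0)
            msg = (alpha.unsqueeze(-1) * feats).sum(dim=0)         # (out_dim,)
            type_messages.append(msg)
            type_scores.append(self.type_att(torch.cat([h_self, msg])))
        # Type-level attention: weigh how much each node type contributes.
        beta = F.softmax(torch.stack(type_scores).squeeze(-1), dim=0)
        agg = (beta.unsqueeze(-1) * torch.stack(type_messages)).sum(dim=0)
        return F.relu(agg)

# Illustrative usage: a short-text node aggregating topic- and entity-type neighbors.
layer = DualAttentionHeteroGCNLayer({"topic": 50, "entity": 100}, out_dim=64)
h = layer(torch.zeros(64), {"topic": torch.randn(5, 50), "entity": torch.randn(3, 100)})
```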

Key words: few-shot short text classification, heterogeneous graph convolutional network, heterogeneous information network for short text, BTM topic model, overfitting

