Computer Engineering, 2025, Vol. 51, Issue (6): 127-135. doi: 10.19678/j.issn.1000-3428.0068324

• Artificial Intelligence and Pattern Recognition •

Text Classification Based on Feature Fusion of Dual Hypergraph Neural Networks

ZHENG Cheng1,2,*, LI Pengfei1,2

  1. School of Computer Science and Technology, Anhui University, Hefei 230601, Anhui, China
  2. Key Laboratory of Computational Intelligence and Signal Processing, Ministry of Education, Hefei 230601, Anhui, China
  • Received: 2023-09-05  Published: 2025-06-15  Online: 2025-06-05
  • Contact: ZHENG Cheng
  • Supported by:
    Key Research and Development Program of Anhui Province (202004d07020009)

Abstract:

In recent years, Graph Neural Networks (GNNs) have been widely applied to text classification. Current GNN-based models first represent a text as a graph and then use a GNN to propagate and aggregate features over that graph. These methods have two limitations. First, constrained by the graph structure, existing models cannot capture high-order semantic relationships between words. Second, existing models cannot capture the key semantic information in a text. To address these issues, this paper proposes a text classification model based on the feature fusion of dual hypergraph convolutional networks. On the one hand, a text hypergraph is built from the original text. On the other hand, external knowledge is introduced for short texts: the text is semantically enhanced with external knowledge from the SenticNet lexicon, and a semantic hypergraph is constructed. After hypergraph convolution, an attention mechanism fuses the features of the two hypergraphs to classify short texts. Experimental results on four text classification datasets show that the proposed model outperforms the baseline models.

Keywords: text classification, hypergraph, feature fusion, SenticNet lexicon, natural language processing
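
The pipeline summarized in the abstract (two hypergraphs over the same words, a hypergraph convolution on each, then attention-based fusion) can be illustrated with a small sketch. The abstract does not give the model's equations, so everything below is an assumption made for illustration: the convolution is a parameter-free mean aggregation (nodes to hyperedges, then hyperedges back to nodes), the attention uses a single hypothetical query vector, and the toy features and hyperedges are invented. It is not the authors' implementation.

```python
import math

def hypergraph_conv(features, hyperedges):
    """One parameter-free hypergraph convolution step (mean aggregation).

    features:   list of per-node feature vectors
    hyperedges: list of node-index lists; a hyperedge may join any number of nodes
    (Assumed formulation for illustration; not taken from the paper.)
    """
    dim = len(features[0])
    # Step 1: node -> hyperedge. Each hyperedge takes the mean of its member nodes.
    edge_feats = [
        [sum(features[v][d] for v in members) / len(members) for d in range(dim)]
        for members in hyperedges
    ]
    # Step 2: hyperedge -> node. Each node takes the mean of its incident hyperedges.
    out = []
    for v in range(len(features)):
        incident = [e for e, members in enumerate(hyperedges) if v in members]
        if not incident:  # isolated node: keep its own features
            out.append(list(features[v]))
            continue
        out.append(
            [sum(edge_feats[e][d] for e in incident) / len(incident) for d in range(dim)]
        )
    return out

def attention_fuse(view_a, view_b, query):
    """Fuse two per-node embeddings with softmax attention over the two views.

    `query` is a hypothetical attention vector (it would be learned in a real model).
    """
    fused, weights = [], []
    for za, zb in zip(view_a, view_b):
        scores = [sum(q * math.tanh(z) for q, z in zip(query, view)) for view in (za, zb)]
        top = max(scores)  # subtract the max for a numerically stable softmax
        exps = [math.exp(s - top) for s in scores]
        total = sum(exps)
        a, b = exps[0] / total, exps[1] / total
        weights.append((a, b))
        fused.append([a * x + b * y for x, y in zip(za, zb)])
    return fused, weights

# Toy document: 4 words with 2-dimensional initial features (made-up values).
X = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.0, 0.0]]
text_edges = [[0, 1, 2], [2, 3]]    # e.g. words that co-occur in the same sentence
semantic_edges = [[0, 3], [1, 2]]   # e.g. words linked by a shared external concept

Z_text = hypergraph_conv(X, text_edges)
Z_sem = hypergraph_conv(X, semantic_edges)
Z, alpha = attention_fuse(Z_text, Z_sem, query=[0.5, -0.5])
```

Because a hyperedge can contain more than two words, a single convolution step already mixes information across a whole group of words, which is the high-order relationship that a pairwise graph edge cannot express directly. The per-node weights in `alpha` show how much each view contributes to the fused embedding.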