
Computer Engineering ›› 2024, Vol. 50 ›› Issue (7): 104-111. doi: 10.19678/j.issn.1000-3428.0068132

• Artificial Intelligence and Pattern Recognition •

  • Supported by:
    National Ministries and Commissions Fund

Text Classification Method Based on Contrastive Learning and Attention Mechanism

Lai QIAN*, Weiwei ZHAO

  1. School of Information and Communication, National University of Defense Technology, Wuhan 430010, Hubei, China
  • Received: 2023-07-24 Online: 2024-07-15 Published: 2024-07-26
  • Corresponding author: Lai QIAN


Abstract:

Text classification is a fundamental task in natural language processing and plays an important role in applications such as information retrieval, machine translation, and sentiment analysis. However, most deep learning models do not fully exploit the rich information in training instances during prediction, resulting in incomplete text feature learning. To fully leverage training instance information, this paper proposes a text classification method based on contrastive learning and an attention mechanism. First, a supervised contrastive learning training strategy is designed to optimize the model's retrieval of text vector representations, thereby improving the quality of the training instances retrieved during inference. Second, an attention mechanism is constructed to learn the attention distribution over the retrieved training text features, focusing on neighboring instances with stronger relevance and capturing more implicit similarity features. Finally, the attention mechanism is combined with the model network to fuse information from neighboring training instances, enhancing the model's ability to extract diverse features and achieving both global and local feature extraction. Experimental results demonstrate that the method yields significant performance gains across multiple models, including Convolutional Neural Network (CNN), Bidirectional Long Short-Term Memory (BiLSTM), Graph Convolutional Network (GCN), Bidirectional Encoder Representations from Transformers (BERT), and RoBERTa. For the CNN model, the macro F1 value increases by 4.15, 6.2, and 1.92 percentage points on the THUCNews, Toutiao, and Sogou datasets, respectively. The method therefore provides an effective solution for text classification tasks.
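The supervised contrastive training strategy described in the abstract can be illustrated with a minimal, self-contained sketch. This is the standard batch-wise supervised contrastive loss (same-label instances in the batch serve as positives), not the paper's exact implementation; the temperature value here is an assumed hyperparameter:

```python
import numpy as np

def supervised_contrastive_loss(embeddings, labels, temperature=0.1):
    """Batch-wise supervised contrastive loss over L2-normalized embeddings.

    For each anchor, same-label instances in the batch are positives and
    all other instances are negatives.
    """
    z = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
    sim = z @ z.T / temperature                    # pairwise cosine similarities
    n = len(labels)
    mask_self = ~np.eye(n, dtype=bool)             # exclude the anchor itself
    logits = np.where(mask_self, sim, -np.inf)
    # numerically stable log-softmax over all non-self pairs
    row_max = logits.max(axis=1, keepdims=True)
    log_prob = logits - (row_max + np.log(np.exp(logits - row_max).sum(axis=1, keepdims=True)))
    losses = []
    for i in range(n):
        positives = [j for j in range(n) if j != i and labels[j] == labels[i]]
        if positives:                              # skip anchors with no positive in the batch
            losses.append(-np.mean(log_prob[i, positives]))
    return float(np.mean(losses))
```

A lower loss means same-label embeddings are pulled together and different-label embeddings pushed apart, which is what makes the later retrieval of relevant training instances reliable.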

Key words: text classification, deep model, contrastive learning, approximate nearest neighbor algorithm, attention mechanism
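The inference-time step outlined in the abstract — retrieving nearby training instances and fusing them with the base model's prediction through an attention distribution — might look like the following sketch. Exact cosine search stands in here for an approximate nearest-neighbor index, and the interpolation weight `alpha` is an assumed hyperparameter, not taken from the paper:

```python
import numpy as np

def retrieve_and_fuse(query_emb, train_embs, train_labels, model_probs,
                      num_classes, k=4, temperature=0.1, alpha=0.5):
    """Retrieve the k most similar training instances for a query, weight
    them by a softmax attention distribution over cosine similarity, and
    interpolate the resulting neighbor label distribution with the base
    model's predicted class probabilities.
    """
    q = query_emb / np.linalg.norm(query_emb)
    t = train_embs / np.linalg.norm(train_embs, axis=1, keepdims=True)
    sims = t @ q
    top = np.argsort(-sims)[:k]                 # k nearest training instances
    att = np.exp(sims[top] / temperature)
    att /= att.sum()                            # attention distribution over neighbors
    neighbor_dist = np.zeros(num_classes)
    for w, j in zip(att, top):
        neighbor_dist[train_labels[j]] += w     # attention-weighted label votes
    return alpha * model_probs + (1 - alpha) * neighbor_dist
```

In a full system the brute-force search would be replaced by an approximate nearest-neighbor library, and the fusion could act on features rather than label distributions, as the abstract's description of combining the attention mechanism with the model network suggests.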