
Computer Engineering ›› 2023, Vol. 49 ›› Issue (10): 72-79. doi: 10.19678/j.issn.1000-3428.0066266

• Artificial Intelligence and Pattern Recognition •

Aspect-Level Sentiment Analysis for Multi-Information Learning by Fusing Syntax Trees

Wenhao ZHANG1, Liefa LIAO1,2, Ruxia WANG1

  1. School of Information Engineering, Jiangxi University of Science and Technology, Ganzhou 341000, Jiangxi, China
  2. School of Software Engineering, Jiangxi University of Science and Technology, Nanchang 330000, China
  • Received: 2022-11-15 Published: 2023-10-15 Online: 2023-01-12
  • About the authors:

    Wenhao ZHANG (born 1997), male, master's student; main research interests: natural language processing and sentiment analysis

    Liefa LIAO, professor, Ph.D.

    Ruxia WANG, master's student

  • Funding:
    National Natural Science Foundation of China (71761018)

Abstract:

In recent years, most aspect-level sentiment analysis methods have mined either semantic information or syntactic information in isolation, without establishing an association between the two. Most existing models also embed only the relative distance or only the syntactic distance of words, ignoring the joint influence of the two distances on aspect words and failing to fully consider the positional relationships of words in the dependency syntax tree. This study proposes MILFST, an aspect-level sentiment analysis model for multi-information learning that fuses syntax trees and is built to exploit the strengths of different neural networks in order to obtain richer information. A Bidirectional Long Short-Term Memory (BiLSTM) network captures the information in the text sequence, and the sequence information is then updated according to the tree structure of the dependency syntax tree. Relative-distance and syntactic-distance positional information is embedded into the text sequence, after which a Convolutional Neural Network (CNN) and a Graph Convolutional Network (GCN) learn semantic and syntactic information, respectively. An attention mechanism fuses the semantic and syntactic information, and the fused representation is fed into a Softmax classifier for sentiment polarity classification. Experimental results show that on the Twitter, Lap14, Res14, Res15, and Res16 datasets, the MILFST model achieves accuracies of 74.27%, 77.74%, 82.50%, 81.73%, and 89.61%, and F1 values of 73.14%, 74.27%, 74.54%, 66.15%, and 71.57%, respectively. The tree structure in the model helps capture information, and learning syntactic and semantic information together benefits the judgment of the sentiment polarity of aspect words.
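
The two positional signals named in the abstract can be illustrated with a minimal sketch. The abstract does not give the formulas used in MILFST, so everything below is an assumption for illustration: the function names, the hand-written toy dependency parse, and in particular `joint_position_weights`, which is one simple hypothetical way to combine the two distances and not the paper's weighting scheme.

```python
from collections import deque


def relative_distances(n_tokens, aspect_start, aspect_end):
    """Linear (word-order) distance of each token to the aspect span
    [aspect_start, aspect_end); tokens inside the span get 0."""
    dists = []
    for i in range(n_tokens):
        if i < aspect_start:
            dists.append(aspect_start - i)
        elif i >= aspect_end:
            dists.append(i - aspect_end + 1)
        else:
            dists.append(0)
    return dists


def syntactic_distances(heads, aspect_start, aspect_end):
    """Shortest-path length, in the undirected dependency tree, from each
    token to the nearest aspect token. heads[i] is the index of token i's
    head, or -1 for the root. Assumes a single connected tree."""
    n = len(heads)
    neighbors = [[] for _ in range(n)]
    for child, head in enumerate(heads):
        if head >= 0:
            neighbors[child].append(head)
            neighbors[head].append(child)
    dists = [None] * n
    queue = deque()
    for i in range(aspect_start, aspect_end):
        dists[i] = 0
        queue.append(i)
    # Breadth-first search outward from the aspect tokens.
    while queue:
        node = queue.popleft()
        for nxt in neighbors[node]:
            if dists[nxt] is None:
                dists[nxt] = dists[node] + 1
                queue.append(nxt)
    return dists


def joint_position_weights(rel, syn):
    """HYPOTHETICAL combination (not taken from the paper): tokens that are
    close to the aspect in both senses get weights near 1."""
    n = len(rel)
    return [1.0 - (r + s) / (2.0 * n) for r, s in zip(rel, syn)]


# Toy sentence with a hand-written (assumed) dependency parse.
tokens = ["The", "food", "was", "great", "but", "the", "service", "was", "slow"]
heads = [1, 3, 3, -1, 8, 6, 8, 8, 3]  # "great" is the root
aspect = (6, 7)                       # the aspect term is "service"

rel = relative_distances(len(tokens), *aspect)  # [6, 5, 4, 3, 2, 1, 0, 1, 2]
syn = syntactic_distances(heads, *aspect)       # [4, 3, 3, 2, 2, 1, 0, 2, 1]
weights = joint_position_weights(rel, syn)
```

In this toy parse, "slow" is two tokens away from "service" in word order but only one edge away in the tree, while the second "was" is the reverse. That difference is why using the two distances jointly can carry more information than either one alone.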

Key words: semantic information, syntactic information, Convolutional Neural Networks (CNN), Graph Convolutional Networks (GCN), attention mechanism
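
As a rough illustration of the syntactic branch and the fusion step described in the abstract, the sketch below implements one graph-convolution layer over a dependency tree, a two-way attention fusion, and a softmax classifier in plain Python. It is not the MILFST implementation. The degree-normalized GCN form, the dot-product scoring against a `query` vector, and all function names are assumptions, and the BiLSTM and CNN branches are omitted.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def matmul(a, b):
    """Product of two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]


def gcn_layer(heads, features, weight):
    """One GCN layer, ReLU(D^-1 (A + I) H W), where A is the undirected
    dependency adjacency matrix built from head indices (-1 marks the root).
    This row-normalized form is a common choice, assumed here."""
    n = len(heads)
    adj = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for child, head in enumerate(heads):
        if head >= 0:
            adj[child][head] = adj[head][child] = 1.0
    norm = [[v / sum(row) for v in row] for row in adj]
    hidden = matmul(matmul(norm, features), weight)
    return [[max(0.0, v) for v in row] for row in hidden]


def attention_fuse(semantic, syntactic, query):
    """HYPOTHETICAL fusion: score each branch vector by its dot product with
    a query vector, softmax the two scores, and take the weighted sum."""
    scores = [sum(q * v for q, v in zip(query, vec))
              for vec in (semantic, syntactic)]
    alpha = softmax(scores)
    fused = [alpha[0] * s + alpha[1] * g for s, g in zip(semantic, syntactic)]
    return fused, alpha


def classify(fused, class_weights):
    """Softmax classifier over the fused vector; class_weights has shape
    (feature_dim, num_classes)."""
    logits = matmul([fused], class_weights)[0]
    return softmax(logits)
```

With `heads = [1, -1, 1]` and identity matrices for `features` and `weight`, each output row of `gcn_layer` is the average of a token's own features and those of its tree neighbors, which is the sense in which the graph convolution propagates information along dependency edges.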