
Computer Engineering ›› 2025, Vol. 51 ›› Issue (9): 80-90. doi: 10.19678/j.issn.1000-3428.0069705

• Artificial Intelligence and Pattern Recognition •


Multimodal Aspect-Based Sentiment Analysis Combining Local Perception and Multi-Level Attention

ZENG Biqing1,2,*, YAO Yongtao1, XIE Liangqi1, CHEN Pengfei1, DENG Huimin3, WANG Ruitang2

  1. School of Software, South China Normal University, Foshan 528225, Guangdong, China
    2. Aberdeen Institute of Data Science and Artificial Intelligence, South China Normal University, Foshan 528225, Guangdong, China
    3. School of Computing Science, Guangdong Agriculture Industry Business Polytechnic, Guangzhou 510507, Guangdong, China
  • Received: 2024-04-07  Revised: 2024-05-16  Online: 2025-09-15  Published: 2024-06-25
  • Contact: ZENG Biqing
  • Supported by: Special Project in Key Fields of Artificial Intelligence of Guangdong Provincial Universities (2019KZDZX1033); Guangdong Basic and Applied Basic Research Foundation (2021A1515011171); Basic and Applied Basic Research Project of the Guangzhou Basic Research Program (202102080282)


Abstract:

Multimodal Aspect-Based Sentiment Analysis (MABSA) aims to identify the sentiment polarity of aspect words from image-text pairs. Existing methods focus on extracting sentiment features from both the image and the text. However, not every image or text feature is useful for the final sentiment prediction: images and texts typically contain large amounts of redundant and noisy information outside the regions related to an aspect word, and different regions of an image or a text may correspond to different aspect words, so noise is introduced at the early stage of image and text feature extraction. In addition, the aspect-related sentiment polarities of the image and the text may be opposite, which implies that interaction information exists between the two modalities. To address these issues, this paper proposes an MABSA model that combines local perception and multi-level attention. First, a local perception module is designed to select the text content and image regions that are semantically relevant to the aspect word. Then, a multi-level attention module is introduced, which uses a bottleneck attention mechanism to extract cross-modal interaction information and improves the accuracy of sentiment-information aggregation. Experimental results show that the model achieves State-Of-The-Art (SOTA) performance on the Twitter2015, Twitter2017, and Multi-ZOL datasets, significantly outperforming comparable models.

Key words: Multimodal Aspect-Based Sentiment Analysis (MABSA), local perception, multi-level attention, local context, bottleneck attention
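To make the fusion step described in the abstract more concrete, the following PyTorch sketch shows one common way a bottleneck attention mechanism can exchange cross-modal information: a few learnable bottleneck tokens first gather information from the text and image features, and each modality then reads the fused tokens back. This is a minimal illustration under assumed names and sizes (`BottleneckFusion`, 4 bottleneck tokens, 768-dimensional features); it is not the authors' implementation, and the local perception step is represented only by the assumption that the inputs are already aspect-filtered features.

```python
# Minimal, illustrative sketch of a bottleneck-attention fusion module.
# All names, token counts, and dimensions are assumptions for illustration;
# this is NOT the implementation from the paper.
import torch
import torch.nn as nn


class BottleneckFusion(nn.Module):
    """Exchange text/image information through a few shared bottleneck tokens."""

    def __init__(self, dim: int = 768, num_heads: int = 8, num_bottleneck: int = 4):
        super().__init__()
        # Learnable bottleneck tokens shared by both modalities.
        self.bottleneck = nn.Parameter(torch.randn(1, num_bottleneck, dim) * 0.02)
        # One attention module per modality (reused for both directions, for brevity).
        self.text_attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        self.image_attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        self.norm_t = nn.LayerNorm(dim)
        self.norm_v = nn.LayerNorm(dim)

    def forward(self, text: torch.Tensor, image: torch.Tensor):
        """
        text:  (B, L_t, dim) aspect-aware text features (assumed already filtered
               by a local-perception-style module).
        image: (B, L_v, dim) aspect-aware image-region features.
        Returns fused text and image features with unchanged shapes.
        """
        b = self.bottleneck.expand(text.size(0), -1, -1)

        # 1) Bottleneck tokens gather information from each modality.
        b_from_text, _ = self.text_attn(b, text, text)
        b_from_image, _ = self.image_attn(b, image, image)
        b = b + b_from_text + b_from_image  # narrow cross-modal channel

        # 2) Each modality reads the fused bottleneck tokens back.
        text_out, _ = self.text_attn(text, b, b)
        image_out, _ = self.image_attn(image, b, b)
        return self.norm_t(text + text_out), self.norm_v(image + image_out)


if __name__ == "__main__":
    fusion = BottleneckFusion()
    t = torch.randn(2, 50, 768)   # e.g., token-level text features
    v = torch.randn(2, 49, 768)   # e.g., 7x7 grid of image-region features
    t_f, v_f = fusion(t, v)
    print(t_f.shape, v_f.shape)   # torch.Size([2, 50, 768]) torch.Size([2, 49, 768])
```

Because all cross-modal exchange is routed through a handful of tokens, each modality can only inject a limited amount of information into the other, which matches the abstract's motivation of suppressing redundant and noisy regions during fusion.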