[1] 周敬一,郭燕,丁友东.基于深度学习的中文影评情感分析[J].上海大学学报(自然科学版), 2018, 24(5):703-712. ZHOU J Y, GUO Y, DING Y D. Sentiment analysis of Chinese movie reviews based on deep learning[J]. Journal of Shanghai University (Natural Science Edition), 2018, 24(5):703-712.(in Chinese) [2] 谢丽星,周明,孙茂松.基于层次结构的多策略中文微博情感分析和特征抽取[J].中文信息学报, 2012, 26(1):73-83. XIE L X, ZHOU M, SUN M S. Hierarchical structure based hybrid approach to sentiment analysis of Chinese micro blog and its feature extraction[J]. Journal of Chinese Information Processing, 2012, 26(1):73-83.(in Chinese) [3] SOLEYMANI M, GARCIA D, JOU B, et al. A survey of multimodal sentiment analysis[J]. Image and Vision Computing, 2017, 65:3-14. [4] 朱鹤,陆小锋,薛雷.基于BERT的金融文本情感分析模型[J].上海大学学报(自然科学版), 2023, 29(1):118-128. ZHU H, LU X F, XUE L. Emotional analysis model of financial text based on the BERT[J]. Journal of Shanghai University (Natural Science Edition), 2023, 29(1):118-128.(in Chinese) [5] 李佩,陈乔松,陈鹏昌,等.基于模态特异及模态共享特征信息的多模态细粒度检索[J].计算机工程, 2022, 48(11):62-68, 76. LI P, CHEN Q S, CHEN P C, et al. Multi-modal fine-grained retrieval based on modal specific and modal shared feature information[J]. Computer Engineering, 2022, 48(11):62-68, 76.(in Chinese) [6] 王旭阳,庞文倩,赵丽婕.多模态方面级情感分析的多视图交互学习网络[J].计算机工程与应用, 2024, 60(7):92-100. WANG X Y, PANG W Q, ZHAO L J. Multiview interaction learning network for multimodal aspect-level sentiment analysis[J]. Computer Engineering and Applications, 2024, 60(7):92-100.(in Chinese) [7] 蒋雪瑶,力维辰,刘井平,等.基于多模态模式迁移的知识图谱实体配图[J].计算机工程, 2022, 48(8):70-76. JIANG X Y, LI W C, LIU J P, et al. Entity image collection based on multi-modality pattern transfer[J]. Computer Engineering, 2022, 48(8):70-76.(in Chinese) [8] SNELL J, SWERSKY K, ZEMEL R. Prototypical networks for few-shot learning[EB/OL].[2023-05-05]. https://arxiv.org/abs/1703.05175. [9] CUI G, HU S, DING N, et al.Prototypical verbalizer for prompt-based few-shot tuning[EB/OL].[2023-05-05]. https://arxiv.org/abs/2203.09770. [10] YU W M, XU H, MENG F Y, et al. CH-SIMS:a Chinese multimodal sentiment analysis dataset with fine-grained annotation of modality[C]//Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics.[S.l.]:Association for Computational Linguistics, 2020:3718-3727. [11] ZADEH A, ZELLERS R, PINCUS E, et al. Multimodal sentiment intensity analysis in videos:facial gestures and verbal messages[J]. IEEE Intelligent Systems, 2016, 31(6):82-88. [12] BAGHER Z A, LIANG P P, PORIA S, et al. Multimodal language analysis in the wild:CMU-MOSEI dataset and interpretable dynamic fusion graph[C]//Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics.[S.l.]:Association for Computational Linguistics, 2018:2236-2246. [13] WILLIAMS J, KLEINEGESSE S, COMANESCU R, et al. Recognizing emotions in video using multimodal DNN feature fusion[C]//Proceedings of Grand Challenge and Workshop on Human Multimodal Language.[S.l.]:Association for Computational Linguistics, 2018:11-19. [14] TSAI Y H H, BAI S J, PU L P, et al. Multimodal transformer for unaligned multimodal language sequences[EB/OL].[2023-05-05]. https://arxiv.org/abs/1906.00295. [15] ZADEH A, LIANG P P, MAZUMDER N, et al. Memory fusion network for multi-view sequential learning[EB/OL].[2023-05-05]. https://arxiv.org/abs/1802.00927. [16] LIU Z, SHEN Y, LAKSHMINARASIMHAN V B, et al. Efficient low-rank multimodal fusion with modality-specific factors[C]//Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics.[S.l.]:Association for Computational Linguistics, 2018:2247-2256. [17] YU W M, XU H, YUAN Z Q, et al. Learning modality-specific representations with self-supervised multi-task learning for multimodal sentiment analysis[J]. Proceedings of the AAAI Conference on Artificial Intelligence, 2021, 35(12):10790-10797. [18] HAZARIKA D, ZIMMERMANN R, PORIA S. MISA:modality-invariant and-specific representations for multimodal sentiment analysis[C]//Proceedings of the 28th ACM International Conference on Multimedia.New York,USA:ACM Press,2020:1122-1131. [19] HAN W, CHEN H, GELBUKH A, et al. Bi-bimodal modality fusion for correlation-controlled multimodal sentiment analysis[C]//Proceedings of 2021 International Conference on Multimodal Interaction. New York,USA:ACM Press,2021:6-15. [20] RAHMAN W, HASAN M K, LEE S W, et al. Integrating multimodal information in large pretrained transformers[EB/OL].[2023-05-05]. https://arxiv.org/abs/1908.05787. [21] KENTON J, TOUTANOVA L. BERT:pre-training of deep bidirectional transformers for language understanding[EB/OL].[2023-05-05]. https://arxiv.org/abs/1810.04805. [22] YANG Z, DAI Z, YANG Y, et al. XLNet:generalized autoregressive pretraining for language understanding[EB/OL].[2023-05-05]. https://arxiv.org/abs/1906.08237. [23] LI X L, LIANG P. Prefix-tuning:optimizing continuous prompts for generation[C]//Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing.[S.l.]:Association for Computational Linguistics, 2021:4582-4597. [24] SHIN T, RAZEGHI Y, LOGAN I R L, et al.AutoPrompt:eliciting knowledge from language models with automatically generated prompts[EB/OL].[2023-05-05]. https://arxiv.org/abs/2010.15980. [25] RADFORD A, KIM J W, HALLACY C, et al. Learning transferable visual models from natural language supervision[EB/OL].[2023-05-05].https://arxiv.org/abs/2103.00020. [26] ZHOU K Y, YANG J K, LOY C C, et al. Learning to prompt for vision-language models[J]. International Journal of Computer Vision, 2022, 130(9):2337-2348. [27] RAO Y M, ZHAO W L, CHEN G Y, et al. DenseCLIP:language-guided dense prediction with context-aware prompting[C]//Proceedings of IEEE/CVF Conference on Computer Vision and Pattern Recognition. Washington D.C.,USA:IEEE Press,2022:18082-18091. [28] LUO Z F, DONG Y X, ZHENG Q H, et al. Dual-channel graph contrastive learning for self-supervised graph-level representation learning[J]. Pattern Recognition, 2023, 139:109448. [29] 李政学,李枝名,彭德中,等.基于特征对比学习和图卷积的社交网络用户分类[J].计算机工程, 2024, 50(4):258-266. LI Z X, LI Z M, PENG D Z, et al. User classification of social networks based on feature contrastive learning and graph convolution[J]. Computer Engineering, 2024, 50(4):258-266.(in Chinese) [30] CHEN T, KORNBLITH S, NOROUZI M, et al. A simple framework for contrastive learning of visual representations[C]//Proceedings of the 37th International Conference on Machine Learning. New York,USA:ACM Press,2020:1597-1607. [31] GUTMANN M, HYVÄRINEN A. Noise-contrastive estimation of unnormalized statistical models, with applications to natural image statistics[J]. Journal of Machine Learning Research, 2012, 13:307-361. [32] HOCHREITER S, SCHMIDHUBER J. Long short-term memory[J]. Neural Computation, 1997, 9(8):1735-1780. |