[1]. 冯广, 江家懿, 罗时强, 等. 基于话语间时序多模态数据的情绪分析方法[J]. 计算机系统应用, 2022, 31(5): 195-202.
FENG G, JIANG J Y, LUO S Q, et al. Sentiment analysis method based on temporal multimodal data between utterances[J]. Computer Systems & Applications, 2022, 31(5): 195-202.
[2]. 冯广, 鲍龙. 基于红外可见光融合的复杂环境下人脸识别方法[J]. 广东工业大学学报, 2024, 41(3): 62-70, 109.
FENG G, BAO L. Face recognition method in complex environment based on infrared visible fusion[J]. Journal of Guangdong University of Technology, 2024, 41(3): 62-70, 109.
[3]. 吴亚迪, 陈平华. 基于用户长短期偏好和音乐情感注意力的音乐推荐模型[J]. 广东工业大学学报, 2023, 40(4): 37-44.
WU Y D, CHEN P H. A music recommendation model based on users' long and short term preferences and music emotional attention[J]. Journal of Guangdong University of Technology, 2023, 40(4): 37-44.
[4]. MAI S, HU H, XING S. Modality to modality translation: an adversarial representation learning and graph fusion network for multimodal fusion[C]//Proceedings of the AAAI Conference on Artificial Intelligence. 2020, 34(1): 164-172.
[5]. PANDEYA Y R, LEE J. Deep learning-based late fusion of multimodal information for emotion classification of music video[J]. Multimedia Tools and Applications, 2021, 80(2): 2887-2905.
[6]. HAN W, CHEN H, GELBUKH A, et al. Bi-bimodal modality fusion for correlation-controlled multimodal sentiment analysis[C]//Proceedings of the 2021 International Conference on Multimodal Interaction. 2021: 6-15.
[7]. 袁萍萍. 基于对比学习的多模态特征融合情感分析算法研究[D]. 南昌: 南昌大学, 2023.
YUAN P P. Research on multimodal feature fusion sentiment analysis algorithm based on contrastive learning[D]. Nanchang: Nanchang University, 2023.
[8]. HAN W, CHEN H, PORIA S. Improving multimodal fusion with hierarchical mutual information maximization for multimodal sentiment analysis[C]//Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing. 2021: 9180-9192.
[9]. LIN Z, LIANG B, LONG Y, et al. Modeling intra- and inter-modal relations: hierarchical graph contrastive learning for multimodal sentiment analysis[C]//Proceedings of the 29th International Conference on Computational Linguistics. Association for Computational Linguistics, 2022: 7124-7135.
[10]. MA F, ZHANG Y, SUN X. Multimodal sentiment analysis with preferential fusion and distance-aware contrastive learning[C]//2023 IEEE International Conference on Multimedia and Expo (ICME). IEEE, 2023: 1367-1372.
[11]. ZADEH A, ZELLERS R, PINCUS E, et al. Multimodal sentiment intensity analysis in videos: facial gestures and verbal messages[J]. IEEE Intelligent Systems, 2016, 31(6): 82-88.
[12]. ZADEH A A B, LIANG P P, PORIA S, et al. Multimodal language analysis in the wild: CMU-MOSEI dataset and interpretable dynamic fusion graph[C]//Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers). 2018: 2236-2246.
[13]. ZADEH A, CHEN M, PORIA S. Tensor fusion network for multimodal sentiment analysis[C]//Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing. 2017: 1103-1114.
[14]. LIU Z, SHEN Y, LAKSHMINARASIMHAN V B, et al. Efficient low-rank multimodal fusion with modality-specific factors [C]// Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers). 2018: 2247-2256.
[15]. ZADEH A, LIANG P P, MAZUMDER N, et al. Memory fusion network for multi-view sequential learning[C]//Proceedings of the AAAI Conference on Artificial Intelligence. 2018, 32(1).
[16]. HAZARIKA D, ZIMMERMANN R, PORIA S. MISA: modality-invariant and -specific representations for multimodal sentiment analysis[C]//Proceedings of the 28th ACM International Conference on Multimedia. 2020: 1122-1131.
[17]. YU W, XU H, YUAN Z, et al. Learning modality-specific representations with self-supervised multi-task learning for multimodal sentiment analysis[C]//Proceedings of the AAAI Conference on Artificial Intelligence. 2021, 35(12): 10790-10797.
[18]. TSAI Y H H, BAI S, LIANG P P, et al. Multimodal transformer for unaligned multimodal language sequences[C]//Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics. 2019: 6558-6569.
[19]. WANG D, LIU S, WANG Q, et al. Cross-modal enhancement network for multimodal sentiment analysis[J]. IEEE Transactions on Multimedia, 2022, 25: 4909-4921.
[20]. RAHMAN W, HASAN M K, LEE S, et al. Integrating multimodal information in large pretrained transformers[C]//Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics. 2020:2359-2369.
[21]. WANG L, PENG J, ZHENG C, et al. A cross modal hierarchical fusion multimodal sentiment analysis method based on multi-task learning[J]. Information Processing & Management, 2024, 61(3): 103675.
[22]. HUANG J, ZHOU J, TANG Z, et al. TMBL: transformer-based multimodal binding learning model for multimodal sentiment analysis[J]. Knowledge-Based Systems, 2024, 285: 111346.
[23]. MAI S, SUN Y, ZENG Y, et al. Excavating multimodal correlation for representation learning[J]. Information Fusion, 2023, 91: 542-555.
[24]. LAI S, LI J, GUO G, et al. Shared and private information learning in multimodal sentiment analysis with deep modal alignment and self-supervised multi-task learning[C]//2024 International Joint Conference on Neural Networks (IJCNN). IEEE, 2024: 1-8.
[25]. JAISWAL A, BABU A R, ZADEH M Z, et al. A survey on contrastive self-supervised learning[J]. Technologies, 2020, 9(1): 2.
[26]. CHEN T, KORNBLITH S, NOROUZI M, et al. A simple framework for contrastive learning of visual representations[C]//International Conference on Machine Learning. PMLR, 2020: 1597-1607.
[27]. GAO T, YAO X, CHEN D. SimCSE: simple contrastive learning of sentence embeddings[J]. arXiv preprint arXiv:2104.08821, 2021.
[28]. RADFORD A, KIM J W, HALLACY C, et al. Learning transferable visual models from natural language supervision[C]//International Conference on Machine Learning. PMLR, 2021: 8748-8763.
[29]. YANG J, YU Y, NIU D, et al. ConFEDE: contrastive feature decomposition for multimodal sentiment analysis[C]//Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers). 2023: 7617-7630.
[30]. MAI S, ZENG Y, ZHENG S, et al. Hybrid contrastive learning of tri-modal representation for multimodal sentiment analysis[J]. IEEE Transactions on Affective Computing, 2022, 14(3): 2276-2289.
[31]. DEVLIN J, CHANG M W, LEE K, et al. BERT: pre-training of deep bidirectional transformers for language understanding[C]//Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers). 2019: 4171-4186.
[32]. HOCHREITER S, SCHMIDHUBER J. Long short-term memory[J]. Neural Computation, 1997, 9(8): 1735-1780.
[33]. HU J, SHEN L, SUN G. Squeeze-and-excitation networks[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR). Salt Lake City, USA: IEEE, 2018: 7132-7141.
[34]. WANG F, JIANG M, QIAN C, et al. Residual attention network for image classification[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR). Honolulu, USA: IEEE, 2017: 3156-3164.
[35]. WU T, PENG J, ZHANG W, et al. Video sentiment analysis with bimodal information-augmented multi-head attention[J]. Knowledge-Based Systems, 2022, 235: 107676.
[36]. LIN R, HU H. Multimodal contrastive learning via uni-modal coding and cross-modal prediction for multimodal sentiment analysis[C]//Findings of the Association for Computational Linguistics: EMNLP 2022. 2022: 522-523.
[37]. LIU S, LUO Z, FU W. FCDNet: fuzzy cognition-based dynamic fusion network for multimodal sentiment analysis[J]. IEEE Transactions on Fuzzy Systems, 2024, 33(1): 3-14.