[1] BALTRUSAITIS T, AHUJA C, MORENCY L P. Multimodal machine learning: a survey and taxonomy. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2019, 41(2): 423-443. doi: 10.1109/TPAMI.2018.2798607
[2] VASWANI A, SHAZEER N, PARMAR N, et al. Attention is all you need[C]//Proceedings of the Advances in Neural Information Processing Systems. Cambridge, USA: MIT Press, 2017: 30-38.
[3] LIU J W, LIU J W, LUO X L. Research progress of attention mechanism in deep learning. Chinese Journal of Engineering, 2021, 43(11): 1499-1511.
[4] HELMI SETYAWAN M Y, AWANGGA R M, EFENDI S R. Comparison of multinomial naive Bayes algorithm and logistic regression for intent classification in chatbot[C]//Proceedings of the International Conference on Applied Engineering. Washington D. C., USA: IEEE Press, 2018: 1-5.
[5] LIU J, LI Y L, LIN M. Summary of intention recognition methods in man-machine dialogue system. Computer Engineering and Applications, 2019, 55(12): 1-7.
[6]
[7] WANG J X, WEI K, RADFAR M, et al. Encoding syntactic knowledge in transformer encoder for intent detection and slot filling. Proceedings of the AAAI Conference on Artificial Intelligence, 2021, 35(16): 13943-13951.
[8] LIU X K, LI J Q, MU J J, et al. Effective open intent classification with K-center contrastive learning and adjustable decision boundary. Proceedings of the AAAI Conference on Artificial Intelligence, 2023, 37(11): 13291-13299.
[9] CASANUEVA I, TEMČINAS T, GERZ D, et al. Efficient intent detection with dual sentence encoders[C]//Proceedings of the 2nd Workshop on Natural Language Processing for Conversational AI. Stroudsburg, USA: ACL Press, 2020: 38-45.
[10] HUANG Y, DU C, XUE Z, et al. What makes multi-modal learning better than single (provably)[C]//Proceedings of the Advances in Neural Information Processing Systems. Cambridge, USA: MIT Press, 2021, 34: 10944-10956.
[11] CHENG D L, ZHANG D W, CHEN Y Q. A summary of multimodal emotion recognition. Journal of Southwest Minzu University (Natural Science Edition), 2022, 48(4): 440-447.
[12] HASAN M K, LEE S W, RAHMAN W, et al. Humor knowledge enriched transformer for understanding multimodal humor. Proceedings of the AAAI Conference on Artificial Intelligence, 2021, 35(14): 12972-12980.
[13] ZHANG H L, XU H, WANG X, et al. MIntRec: a new dataset for multimodal intent recognition[C]//Proceedings of the 30th ACM International Conference on Multimedia. New York, USA: ACM Press, 2022: 1688-1697.
[14] ZHAN L M, LIANG H, LIU B, et al. Out-of-scope intent detection with self-supervision and discriminative training[EB/OL]. [2024-04-30]. https://arxiv.org/pdf/2106.08616.
[15] ZHOU Y H, LIU P J, QIU X P. KNN-contrastive learning for out-of-domain intent classification[C]//Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics. Stroudsburg, USA: ACL Press, 2022: 5129-5141.
[16] GANDHI A, ADHVARYU K, PORIA S, et al. Multimodal sentiment analysis: a systematic review of history, datasets, multimodal fusion methods, applications, challenges and future directions. Information Fusion, 2023, 91: 424-444. doi: 10.1016/j.inffus.2022.09.025
[17] HAZARIKA D, ZIMMERMANN R, PORIA S. MISA: modality-invariant and -specific representations for multimodal sentiment analysis[C]//Proceedings of the 28th ACM International Conference on Multimedia. New York, USA: ACM Press, 2020: 1122-1131.
[18]
[19] LIU Z, SHEN Y, LAKSHMINARASIMHAN V, et al. Efficient low-rank multimodal fusion with modality-specific factors[EB/OL]. [2024-04-30]. https://arxiv.org/pdf/1806.00064.
[20] TSAI Y H, BAI S J, LIANG P P, et al. Multimodal transformer for unaligned multimodal language sequences[C]//Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics. Stroudsburg, USA: ACL Press, 2019: 6558.
[21] RAHMAN W, HASAN M K, LEE S W, et al. Integrating multimodal information in large pretrained transformers[C]//Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics. Stroudsburg, USA: ACL Press, 2020: 2359.
[22] HAN W, CHEN H, PORIA S. Improving multimodal fusion with hierarchical mutual information maximization for multimodal sentiment analysis[EB/OL]. [2024-04-30]. https://arxiv.org/pdf/2109.00412.
[23] YU W M, XU H, YUAN Z Q, et al. Learning modality-specific representations with self-supervised multi-task learning for multimodal sentiment analysis. Proceedings of the AAAI Conference on Artificial Intelligence, 2021, 35(12): 10790-10797.
[24] DEVLIN J, CHANG M W, LEE K, et al. BERT: pre-training of deep bidirectional transformers for language understanding[EB/OL]. [2024-04-30]. https://arxiv.org/pdf/1810.04805.
[25] LIU Z, LIN Y T, CAO Y, et al. Swin Transformer: hierarchical vision transformer using shifted windows[C]//Proceedings of the IEEE/CVF International Conference on Computer Vision. Washington D. C., USA: IEEE Press, 2021: 10012-10022.
[26] CHEN S Y, WANG C Y, CHEN Z Y, et al. WavLM: large-scale self-supervised pre-training for full stack speech processing. IEEE Journal of Selected Topics in Signal Processing, 2022, 16(6): 1505-1518. doi: 10.1109/JSTSP.2022.3188113
[27] ZHANG H L, WANG X, XU H, et al. MIntRec2.0: a large-scale benchmark dataset for multimodal intent recognition and out-of-scope detection in conversations[EB/OL]. [2024-04-30]. https://arxiv.org/pdf/2403.10943.
[28] PORIA S, HAZARIKA D, MAJUMDER N, et al. MELD: a multimodal multi-party dataset for emotion recognition in conversations[C]//Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics. Stroudsburg, USA: ACL Press, 2019: 527-536.