[1] CHEN C H. Research on multi-modal mandarin speech emotion recognition based on SVM[C]//Proceedings of the IEEE International Conference on Power, Intelligent Computing and Systems. Washington D. C., USA: IEEE Press, 2019: 173-176.
[2] QIAO D, CHEN Z J, DENG L, et al. Method for Chinese speech emotion recognition based on improved speech-processing convolutional neural network[J]. Computer Engineering, 2022, 48(2): 281-290. (in Chinese)
[3] LIU S H, SUN X, LI C B. Emotion recognition using EEG signals based on location information reconstruction and time-frequency information fusion[J]. Computer Engineering, 2021, 47(12): 95-102. (in Chinese)
[4] TAN Y, SUN Z, DUAN F, et al. A multimodal emotion recognition method based on facial expressions and electroencephalography[J]. Biomedical Signal Processing and Control, 2021, 70: 103029. DOI: 10.1016/j.bspc.2021.103029.
[5] DEVLIN J, CHANG M W, LEE K, et al. BERT: pre-training of deep bidirectional Transformers for language understanding[EB/OL]. [2022-05-17]. https://arxiv.org/abs/1810.04805.
[6] HSU W N, BOLTE B, TSAI Y H H, et al. HuBERT: self-supervised speech representation learning by masked prediction of hidden units[J]. IEEE/ACM Transactions on Audio, Speech, and Language Processing, 2021, 29: 3451-3460. DOI: 10.1109/TASLP.2021.3122291.
[7] BALTRUŠAITIS T, ROBINSON P, MORENCY L P. OpenFace: an open source facial behavior analysis toolkit[C]//Proceedings of the IEEE Winter Conference on Applications of Computer Vision. Washington D. C., USA: IEEE Press, 2016: 1-10.
[8]
[9]
[10]
[11] YANG K C, XU H, GAO K. CM-BERT: cross-modal BERT for text-audio sentiment analysis[C]//Proceedings of the 28th ACM International Conference on Multimedia. New York, USA: ACM Press, 2020: 521-528.
[12] TSAI Y H H, BAI S J, LIANG P P, et al. Multimodal Transformer for unaligned multimodal language sequences[C]//Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics. Philadelphia, USA: ACL Press, 2019: 6558-6569.
[13] KWON S. MLT-DNet: speech emotion recognition using 1D dilated CNN based on multi-learning trick approach[J]. Expert Systems with Applications, 2021, 167: 114177. DOI: 10.1016/j.eswa.2020.114177.
[14] CHUNG J, GULCEHRE C, CHO K, et al. Empirical evaluation of gated recurrent neural networks on sequence modeling[EB/OL]. [2022-05-17]. https://arxiv.org/abs/1412.3555.
[15] TANG G, MÜLLER M, RIOS A, et al. Why self-attention? A targeted evaluation of neural machine translation architectures[EB/OL]. [2022-05-17]. https://arxiv.org/abs/1808.08946.
[16]
[17] ZADEH A B, LIANG P P, PORIA S, et al. Multimodal language analysis in the wild: CMU-MOSEI dataset and interpretable dynamic fusion graph[C]//Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics. Philadelphia, USA: ACL Press, 2018: 2236-2246.
[18] BUSSO C, BULUT M, LEE C C, et al. IEMOCAP: interactive emotional dyadic motion capture database[J]. Language Resources and Evaluation, 2008, 42(4): 335-359. DOI: 10.1007/s10579-008-9076-6.
[19] PHAM H, LIANG P P, MANZINI T, et al. Found in translation: learning robust joint representations by cyclic translations between modalities[C]//Proceedings of the AAAI Conference on Artificial Intelligence. Palo Alto, USA: AAAI Press, 2019: 6892-6899.
[20] WANG Y, SHEN Y, LIU Z, et al. Words can shift: dynamically adjusting word representations using nonverbal behaviors[C]//Proceedings of the AAAI Conference on Artificial Intelligence. Palo Alto, USA: AAAI Press, 2019: 7216-7223.
[21] MAI S J, HU H F, XU J, et al. Multi-fusion residual memory network for multimodal human sentiment comprehension[J]. IEEE Transactions on Affective Computing, 2022, 13(1): 320-334. DOI: 10.1109/TAFFC.2020.3000510.
[22] LÜ F M, CHEN X, HUANG Y Y, et al. Progressive modality reinforcement for human multimodal emotion recognition from unaligned multimodal sequences[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Washington D. C., USA: IEEE Press, 2021: 2554-2562.
[23] SHENOY A, SARDANA A. Multilogue-Net: a context-aware RNN for multi-modal emotion detection and sentiment analysis in conversation[EB/OL]. [2022-05-17]. https://arxiv.org/abs/2002.08267.
[24]
[25] RAHMAN W, HASAN M K, LEE S W, et al. Integrating multimodal information in large pretrained Transformers[C]//Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics. Philadelphia, USA: ACL Press, 2020: 2359-2369.