An Expression Recognition Algorithm Based on Term Frequency-Inverse Document Frequency and Hybrid Loss

doi:10.19678/j.issn.1000-3428.0063455

Abstract

Facial expressions convey people's mental activity and state of mind naturally and efficiently, and they strongly shape how people communicate. In many intelligent applications, facial expression recognition is an important basis for emotional interaction between humans and machines. In fine-grained facial expression recognition, detail features are lost because the feature extraction network does not adequately process the key features of the regions where expressions are produced. This paper proposes a Term Frequency-Inverse Document Frequency Spatial Pyramid Attention (TF-IDF SPA) mechanism that adjusts the attention distribution over the key expression-producing regions and strengthens the network's ability to extract key detail features there. To address the small inter-class differences and large intra-class differences common in facial expression recognition, an improved hybrid weighted loss function is designed to tighten intra-class clustering and enlarge the distance between classes. The class weights of the loss function are adjusted dynamically according to the number of samples per class in the dataset, which strengthens the model's learning of classes with few samples. On this basis, structurally simple TF-IDF SPA modules are stacked with convolution layers to build a facial expression recognition network. Experimental results show that the network performs well, achieving classification accuracies of 73.52% and 98.27% on the FER2013 and CK+ datasets, respectively.
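The abstract states that the loss weights follow the sample distribution of the dataset but does not give the formula. As a minimal sketch of that idea only, the Python below assumes inverse-frequency class weights applied to a plain cross-entropy term. The function names, the toy labels, and the weighting rule are illustrative assumptions, not the paper's hybrid loss, which also includes terms for intra-class cohesion and inter-class distance that are not reproduced here.

```python
import math
from collections import Counter


def class_weights(labels):
    """Assumed rule: weight_c = N / (C * n_c), so rarer classes weigh more.

    N is the number of samples, C the number of classes, n_c the count of
    class c. With this rule the weights average to 1 over the samples.
    """
    counts = Counter(labels)
    n_classes = len(counts)
    total = len(labels)
    return {c: total / (n_classes * k) for c, k in counts.items()}


def weighted_cross_entropy(probs, labels, weights):
    """Mean of -w[y] * log p[y]; probs[i] maps class name -> probability."""
    losses = [-weights[y] * math.log(p[y]) for p, y in zip(probs, labels)]
    return sum(losses) / len(losses)


# Toy, imbalanced label set (hypothetical, not drawn from FER2013 or CK+).
labels = ["happy"] * 6 + ["disgust"] * 2
weights = class_weights(labels)

# A uniform prediction for every sample, used only to exercise the loss.
probs = [{"happy": 0.5, "disgust": 0.5} for _ in labels]
loss = weighted_cross_entropy(probs, labels, weights)
```

In this toy case the two "disgust" samples receive three times the weight of each "happy" sample, so errors on the small class contribute as much to the total loss as errors on the large one.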

Key words: expression recognition, FER2013 dataset, CK+ dataset, term frequency-inverse document frequency, loss function, attention mechanism


CLC Number: TP391

LAN Zhengjie, WANG Lie, NIE Xiong. An Expression Recognition Algorithm Based on Term Frequency-Inverse Document Frequency and Hybrid Loss[J]. Computer Engineering, 2023, 49(1): 295-302,310.



URL: http://www.ecice06.com/EN/10.19678/j.issn.1000-3428.0063455

http://www.ecice06.com/EN/Y2023/V49/I1/295


