Fine-Grained Expression Recognition Model Based on Multi-Scale Hierarchical Bilinear Pooling Network

doi:10.19678/j.issn.1000-3428.0060133

Abstract

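Abstract: The subtle inter-class differences and significant intra-class variations of facial expressions make facial expression recognition difficult. A recognition model based on a multi-scale bilinear pooling neural network is constructed. Three networks of different scales are designed to extract global features of facial expressions, and a hierarchical bilinear pooling layer is introduced to integrate multi-scale cross-layer bilinear features from within the same network and across different networks. This captures part feature relationships between different levels and strengthens the model's ability to represent and discriminate subtle facial expression features. In addition, layer-by-layer deconvolution is used to fuse multi-layer feature information, which addresses the loss of some key features when a neural network extracts features through multiple convolutional and pooling layers. Experimental results show that the model achieves recognition rates of 73.725% and 98.28% on the public FER2013 and CK+ datasets, respectively, outperforming facial expression recognition models such as SLPM, CL, and JNS.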

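Keywords: convolutional neural network, fine-grained expression recognition, multi-scale network, hierarchical bilinear pooling, multi-layer feature fusion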

Abstract: Facial expressions show subtle differences between classes and significant variation within a class, which makes expression recognition difficult. To address this problem, a neural network model based on multi-scale bilinear pooling is proposed. Global features of facial expressions are extracted by three networks of different scales. A hierarchical bilinear pooling layer is then introduced, and multi-scale cross-layer bilinear features from the same network and from different networks are integrated to capture part feature relationships between different levels, which enhances the ability of the model to represent and discriminate subtle features of facial expressions. Multi-layer feature information is fused by layer-by-layer deconvolution, which addresses the loss of key features that occurs when the neural network extracts features through multiple convolutional and pooling layers. The experimental results show that the proposed model achieves a recognition accuracy of 73.725% on the public FER2013 dataset and 98.82% on the public CK+ dataset, outperforming SLPM, CL, JNS and other facial expression recognition algorithms.

Key words: Convolutional Neural Network (CNN), fine-grained expression recognition, multi-scale network, hierarchical bilinear pooling, multi-layer feature fusion
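The cross-layer bilinear pooling mentioned in the abstract can be illustrated with a small sketch. The code below is not the authors' network. It assumes the factorized (Hadamard-product) form of bilinear pooling described in references [12] and [13]: features from two layers are linearly projected, multiplied element-wise, sum-pooled over spatial locations, and normalized, and the results for every pair of layers are concatenated. All function names, the toy sizes, and the random weights are hypothetical and chosen only for illustration.

```python
import math
import random


def project(features, weights):
    """Apply a 1x1-convolution-style linear projection at every location.

    features: list of locations, each a list of C channel values.
    weights:  C x D matrix (list of C rows, each with D values).
    """
    dim = len(weights[0])
    return [
        [sum(x[c] * weights[c][d] for c in range(len(x))) for d in range(dim)]
        for x in features
    ]


def cross_layer_bilinear(feat_a, feat_b, w_a, w_b):
    """Factorized bilinear interaction between two layers' feature maps.

    Projected features are multiplied element-wise, sum-pooled over
    locations, then passed through a signed square root and L2 normalization.
    """
    if len(feat_a) != len(feat_b):
        raise ValueError("feature maps must cover the same locations")
    proj_a = project(feat_a, w_a)
    proj_b = project(feat_b, w_b)
    dim = len(w_a[0])
    pooled = [
        sum(pa[d] * pb[d] for pa, pb in zip(proj_a, proj_b))
        for d in range(dim)
    ]
    signed_sqrt = [math.copysign(math.sqrt(abs(v)), v) for v in pooled]
    norm = math.sqrt(sum(v * v for v in signed_sqrt)) or 1.0
    return [v / norm for v in signed_sqrt]


def hierarchical_bilinear(feature_maps, weights):
    """Concatenate the bilinear features of every pair of layers."""
    out = []
    for i in range(len(feature_maps)):
        for j in range(i + 1, len(feature_maps)):
            out.extend(
                cross_layer_bilinear(
                    feature_maps[i], feature_maps[j], weights[i], weights[j]
                )
            )
    return out


def random_matrix(rows, cols, rng):
    """Toy stand-in for learned weights or extracted feature maps."""
    return [[rng.uniform(-1.0, 1.0) for _ in range(cols)] for _ in range(rows)]


rng = random.Random(0)
locations, channels, dim = 16, 8, 4  # toy sizes, not taken from the paper
maps = [random_matrix(locations, channels, rng) for _ in range(3)]
weights = [random_matrix(channels, dim, rng) for _ in range(3)]
descriptor = hierarchical_bilinear(maps, weights)
print(len(descriptor))  # 3 layer pairs x 4 projected dimensions = 12
```

In a real model the projections would be learned 1x1 convolutions and the feature maps would come from convolutional layers resized to a common spatial resolution. The sketch only shows how pairwise interactions between layers yield one concatenated descriptor for classification.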

CLC number: TP391

SU Zhiming, WANG Lie, LAN Zhengjie. Fine-Grained Expression Recognition Model Based on Multi-Scale Hierarchical Bilinear Pooling Network[J]. Computer Engineering, 2021, 47(12): 299-307, 315.

https://www.ecice06.com/CN/Y2021/V47/I12/299

Figures/Tables: 20

References

[1] LI S, DENG W. Deep facial expression recognition: a survey[J]. IEEE Transactions on Affective Computing, 2018, 3(9): 1-10.
[2] WOLD S, ESBENSEN K, GELADI P. Principal component analysis[J]. Chemometrics and Intelligent Laboratory Systems, 1987, 2(3): 37-52.
[3] OJALA T, PIETIKAINEN M, HARWOOD D. A comparative study of texture measures with classification based on featured distributions[J]. Pattern Recognition, 1996, 29(1): 51-59.
[4] COOTES T F, TAYLOR C J, COOPER D H, et al. Active shape models-their training and application[J]. Computer Vision and Image Understanding, 1995, 61(1): 38-59.
[5] COOTES T F, EDWARDS G J, TAYLOR C J. Comparing active shape models with active appearance models[EB/OL]. [2020]. https://www.researchgate.net/publication/221259802_Comparing_Active_Shape_Models_with_Active_Appearance_Models.
[6] YANG X, SHANG Z H. Facial expression recognition based on improved AlexNet[J]. Advances in Laser and Optoelectronics, 2020, 57(14): 243-250. (in Chinese)
[7] FENG Y. Facial expression recognition based on small-scale kernel convolution[D]. Wuhan: Central China Normal University, 2020. (in Chinese)
[8] LIU X Q, ZHOU F Y. Improved curriculum learning using SSM for facial expression recognition[J]. The Visual Computer, 2020, 36(6): 1-15.
[9] LI Y, LIN X Z, JIANG M Y. Facial expression recognition based on cross-connect LeNet-5 network[J]. Acta Automatica Sinica, 2018, 44(1): 176-182. (in Chinese)
[10] ZHANG A M, XU Y. Attention hierarchical bilinear pooling residual network for expression recognition[J]. Computer Engineering and Applications, 2020, 56(23): 161-166. (in Chinese)
[11] HE K M, ZHANG X Y, REN S Q, et al. Delving deep into rectifiers: surpassing human-level performance on ImageNet classification[C]//Proceedings of 2015 IEEE International Conference on Computer Vision. Washington D.C., USA: IEEE Press, 2015: 1026-1034.
[12] YU C J, ZHAO X Y, ZHENG Q, et al. Hierarchical bilinear pooling for fine-grained visual recognition[C]//Proceedings of 2018 European Conference on Computer Vision. Berlin, Germany: Springer, 2018: 574-589.
[13] KIM J H, ON K W, LIM W, et al. Hadamard product for low-rank bilinear pooling[EB/OL]. [2020-10-29]. https://arxiv.org/abs/1610.04325v1.
[14] ZHANG H Y, CISSE M, DAUPHIN Y N, et al. Mixup: beyond empirical risk minimization[EB/OL]. [2020-10-29]. https://arxiv.org/abs/1710.09412.
[15] YANG S Z, GONG Z, YE K, et al. EdgeCNN: convolutional neural network classification model with small inputs for edge computing[EB/OL]. [2020-10-29]. https://www.researchgate.net/publication/336147679_EdgeCNN_Convolutional_Neural_Network_Classification_Model_with_small_inputs_for_Edge_Computing.
[16] GOODFELLOW I J, ERHAN D, CARRIER P L, et al. Challenges in representation learning: a report on three machine learning contests[C]//Proceedings of 2013 International Conference on Neural Information Processing. Berlin, Germany: Springer, 2013: 117-124.
[17] LUCEY P, COHN J F, KANADE T, et al. The extended Cohn-Kanade dataset (CK+): a complete dataset for action unit and emotion-specified expression[C]//Proceedings of 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition-Workshops. Washington D.C., USA: IEEE Press, 2010: 94-101.
[18] LUO L C, XIONG Y H, LIU Y, et al. Adaptive gradient methods with dynamic bound of learning rate[EB/OL]. [2020-10-29]. https://www.researchgate.net/publication/331371132_Adaptive_Gradient_Methods_with_Dynamic_Bound_of_Learning_Rate.
[19] TURAN C, LAM K M, HE X. Soft locality preserving map for facial expression recognition[EB/OL]. [2020-10-29]. https://arxiv.org/abs/1801.03754.
[20] ZHOU J C, JIA X, SHEN L L, et al. Improved softmax loss for deep learning-based face and expression recognition[J]. Cognitive Computation and Systems, 2019, 1(4): 97-102.
[21] YANG H Y, CIFTCI U, YIN L J. Facial expression recognition by de-expression residue learning[C]//Proceedings of 2018 IEEE Conference on Computer Vision and Pattern Recognition. Washington D.C., USA: IEEE Press, 2018: 2168-2177.
[22] TIAN Y, WEN Z W, XIE W C, et al. Outlier-suppressed triplet loss with adaptive class-aware margins for facial expression recognition[C]//Proceedings of 2019 IEEE International Conference on Image Processing. Washington D.C., USA: IEEE Press, 2019: 46-50.
[23] SHAO J, QIAN Y S. Three convolutional neural network models for facial expression recognition in the wild[J]. Neurocomputing, 2019, 355(25): 82-92.
[24] LAN L Q, LI X, LIU Q Y, et al. Facial expression recognition method based on a joint normalization strategy[J]. Journal of Beijing University of Aeronautics and Astronautics, 2020, 46(9): 1797-1806. (in Chinese)
