
Computer Engineering ›› 2025, Vol. 51 ›› Issue (5): 288-304. doi: 10.19678/j.issn.1000-3428.0068949

• Graphics and Image Processing •

基于量子化降噪自编码器的遮挡微表情重建方法研究

刘慧1,2, 郭特1,2, 刘栋1,2, 李颖颖3   

  1. 河南师范大学计算机与信息工程学院, 河南 新乡 453007;
    2. 河南省高校计算智能与数据挖掘工程技术研究中心, 河南 新乡 453007;
    3. 战略支援部队信息工程大学网络空间安全学院, 河南 郑州 450002
  • 收稿日期:2023-12-04 修回日期:2024-02-26 出版日期:2025-05-15 发布日期:2024-06-03
  • Corresponding author: LIU Dong, E-mail: liudong@htu.edu.cn
  • Supported by:
    Major Science and Technology Project of Henan Province (221100210600).

Research on Reconstruction Method of Occluded Micro-Expressions Based on Quantized Denoising Autoencoder

LIU Hui1,2, GUO Te1,2, LIU Dong1,2, LI Yingying3   

  1. College of Computer and Information Engineering, Henan Normal University, Xinxiang 453007, Henan, China;
    2. Engineering Technology Research Center for Computing Intelligence and Data Mining of Henan Colleges, Xinxiang 453007, Henan, China;
    3. School of Cybersecurity, Strategic Support Force Information Engineering University, Zhengzhou 450002, Henan, China
  • Received:2023-12-04 Revised:2024-02-26 Online:2025-05-15 Published:2024-06-03

摘要: 微表情是一种心理健康诊断的重要依据,眼镜、口罩等物体造成的遮挡会导致微表情识别困难。现有遮挡微表情重建方法以RGB纹理信息重建为主,存在信息大量冗余、难以实现对纹理的精确重建等问题。此外,重建方法采用的模型多为基于U-Net的对称自编码器和生成对抗网络(GAN)等,存在浅层的对称结构重建能力有限、对抗损失收敛困难等问题。为此,提出一种基于量子化降噪自编码器的微表情遮挡区域动态流特征重建方法。首先,基于光流和动态图像提出光照能量鲁棒的动态流特征表示,有效聚合所有TVL1光流中的运动信息,并简化纹理信息;其次,基于离散编码的变分自编码器(VQ-VAE)提出一种双层结构向量量子化降噪自编码器(VQ-DAE),用于微表情的遮挡区域动态流特征重建,以进行遮挡微表情的识别。实验结果表明,该方法能较好地重建遮挡区域的运动信息,在CASME、CAS(ME)²、CASME Ⅱ这3个数据集上的准确率分别达到77.89%、72.02%、61.04%。与传统方法、基于空间注意力及自注意力方法相比,所提方法在准确率、未加权平均召回率(UAR)、Macro-F1等指标上均有显著的性能提升。

关键词: 遮挡微表情识别, 特征重建, 光流, 动态图像, 降噪自编码器

Abstract: Micro-expressions are an important basis for mental health diagnosis, but occlusions caused by objects such as glasses or masks make them difficult to recognize. Existing reconstruction methods for occluded micro-expressions mainly reconstruct RGB texture information, which is highly redundant and hard to reconstruct precisely. In addition, the models used are mostly U-Net-based symmetric autoencoders and Generative Adversarial Networks (GANs); the former are limited by the reconstruction capacity of shallow symmetric structures, and the latter by adversarial losses that are difficult to converge. To address these problems, this paper proposes a method, based on a vector-quantized denoising autoencoder, for reconstructing dynamic flow features in the occluded regions of micro-expressions. First, a dynamic flow feature representation that is robust to illumination changes is proposed based on optical flow and dynamic images; it aggregates the motion information from all TVL1 optical flows and simplifies the texture information. Second, building on the discretely coded Vector Quantized Variational Autoencoder (VQ-VAE), a two-layer Vector Quantized Denoising Autoencoder (VQ-DAE) is proposed to reconstruct the dynamic flow features of occluded regions so that occluded micro-expressions can be recognized. Experimental results show that the method reconstructs the motion information in occluded regions well, achieving accuracies of 77.89%, 72.02%, and 61.04% on the CASME, CAS(ME)², and CASME Ⅱ datasets, respectively. Compared with traditional methods and with methods based on spatial attention and self-attention, the proposed method yields significant improvements in accuracy, Unweighted Average Recall (UAR), and Macro-F1.
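
The abstract reports accuracy, UAR, and Macro-F1 without defining them. The following is a minimal sketch of the standard definitions, not the authors' evaluation code: the function names and the toy emotion labels are hypothetical, and the paper's actual protocol (for example, its data splits) is not stated here. UAR averages per-class recall and Macro-F1 averages per-class F1, so both weight rare classes equally with frequent ones.

```python
def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label equals the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def class_counts(y_true, y_pred, label):
    """True positives, false positives and false negatives for one class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == label and p == label for t, p in pairs)
    fp = sum(t != label and p == label for t, p in pairs)
    fn = sum(t == label and p != label for t, p in pairs)
    return tp, fp, fn


def unweighted_average_recall(y_true, y_pred):
    """Mean of per-class recall over the classes present in y_true."""
    labels = sorted(set(y_true))
    recalls = []
    for label in labels:
        tp, _, fn = class_counts(y_true, y_pred, label)
        recalls.append(tp / (tp + fn))  # tp + fn > 0: label occurs in y_true
    return sum(recalls) / len(recalls)


def macro_f1(y_true, y_pred):
    """Mean of per-class F1 over the classes present in y_true."""
    labels = sorted(set(y_true))
    scores = []
    for label in labels:
        tp, fp, fn = class_counts(y_true, y_pred, label)
        scores.append(2 * tp / (2 * tp + fp + fn))  # F1 = 2TP / (2TP + FP + FN)
    return sum(scores) / len(scores)


# Hypothetical, imbalanced toy data: every "surprise" sample is misclassified.
y_true = ["happy"] * 4 + ["surprise"] * 2 + ["disgust"] * 2
y_pred = ["happy"] * 4 + ["happy"] * 2 + ["disgust"] * 2

print(f"accuracy = {accuracy(y_true, y_pred):.3f}")                  # 0.750
print(f"UAR      = {unweighted_average_recall(y_true, y_pred):.3f}")  # 0.667
print(f"Macro-F1 = {macro_f1(y_true, y_pred):.3f}")                  # 0.600
```

On this toy data the accuracy stays at 0.750 even though one class is never recognized, while UAR and Macro-F1 drop to 0.667 and 0.600. This sensitivity to minority classes is the usual reason the class-balanced metrics are reported alongside accuracy on imbalanced micro-expression datasets.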

Key words: occluded micro-expression recognition, feature reconstruction, optical flow, dynamic image, denoising autoencoder

CLC Number: