Computer Engineering ›› 2026, Vol. 52 ›› Issue (6): 403-413. doi: 10.19678/j.issn.1000-3428.0070243

• Cross-Disciplinary Integration and Engineering Applications •

EBP-Net: 多尺度注意力约束的人脸视频血压预测

周运梁1, 何源夏1, 冯子麒1, 郑乃弋1, 徐晓刚1,*, 徐冠雷1, 陈少辉2

  1. 浙江工商大学计算机科学与技术学院, 浙江 杭州 310000
  2. 浙江工业大学之江学院信息工程学院, 浙江 杭州 310000
  • 收稿日期:2024-08-12 修回日期:2024-10-11 出版日期:2026-06-15 发布日期:2024-12-30
  • 通讯作者: 徐晓刚
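  • About the authors:

    ZHOU Yunliang, male, master's student; main research interest: computer vision

    HE Yuanxia, master's degree

    FENG Ziqi, master's degree

    ZHENG Naiyi, master's degree

    XU Xiaogang (corresponding author), professor, Ph.D.

    XU Guanlei, associate professor, Ph.D.

    CHEN Shaohui, lecturer, Ph.D.

  • Funding:
    National Key Research and Development Program of China (2023YFC3306203)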

EBP-Net: Blood Pressure Prediction from Face Videos using Multi-Scale Attention Constraint

ZHOU Yunliang1, HE Yuanxia1, FENG Ziqi1, ZHENG Naiyi1, XU Xiaogang1,*, XU Guanlei1, CHEN Shaohui2

  1. College of Computer Science and Technology, Zhejiang Gongshang University, Hangzhou 310000, Zhejiang, China
  2. College of Information Engineering, Zhijiang College of Zhejiang University of Technology, Hangzhou 310000, Zhejiang, China
  • Received: 2024-08-12 Revised: 2024-10-11 Published: 2026-06-15 Online: 2024-12-30
  • Contact: XU Xiaogang

摘要:

血压作为评估心血管健康的关键指标, 在日常居家血压监测中, 除传统的血压仪外, 目前研究人员采用的主流方式仍为使用多个生理信号等一些非端到端测量方式, 这些方式存在采集多个生理信号较为困难且成本昂贵的问题, 另外难以保持采集信号的时间同步性。另一种是现存的利用人脸视频预测血压, 即端到端的方式, 这种方式在一定程度上拓宽了适用场景, 但在感兴趣区域的选取、预测准确度等方面仍存在问题。为解决这些问题, 提出一种融合多尺度注意力结构在可见光场景下远程血压预测方法。首先对每个人脸视频候选窗口进行分类和回归提取有效皮肤区域, 并利用基于光流的技术从连续的有效人脸区域中提取远程光电容积描记法(rPPG)信号, 并将完整的rPPG信号通过小波变换滤波、去趋势等方式提取稳健rPPG信号。其次提出的EBP-Net引入并改进一种新的高效多尺度注意力(EMA)模块和多尺度融合(MSF)模块, 不仅能够在不降低通道维度的情况下增强深度视觉表示的特征, 而且可以通过多尺度特征的捕捉和层次化表达, 显著提升模型对生理信号的理解能力和预测能力。实验结果表明, 收缩压(SBP)在两个数据集上的表现已达到英国高血压协会(BHS)C级标准, 同时, 舒张压(DBP)达到B级标准, 收缩压和舒张压的平均绝对误差(MAE)分别达到6.82 mmHg和5.17 mmHg, 低于近期同等研究结果。与其他模型相比, 提出方法更具有泛化能力和更低的误差, 可为人脸血压水平检测提供有效方法和建议。

关键词: 多尺度注意力, 脉动信号, 血压估算, 光流, 远程光电容积描记法信号

Abstract:

Blood pressure is a key indicator of cardiovascular health. For daily home monitoring, apart from traditional blood pressure meters, the mainstream approach among researchers is still non-end-to-end measurement that combines multiple physiological signals. Such methods have drawbacks: collecting multiple physiological signals is difficult and expensive, and keeping the collected signals time-synchronized is challenging. The alternative, end-to-end prediction of blood pressure from face videos, broadens the range of applicable scenarios to some extent, but existing methods still have problems with the selection of the region of interest and with prediction accuracy. To address these problems, this paper proposes EBP-Net, a remote blood pressure prediction method for visible-light scenes that incorporates a multi-scale attention structure. First, classification and regression are performed on each candidate window of the face video to extract the effective skin region; the remote Photoplethysmography (rPPG) signal is then extracted from the consecutive effective face regions using an optical flow-based technique, and a robust rPPG signal is obtained from the complete signal through wavelet-transform filtering, detrending, and related processing. Second, EBP-Net introduces and improves an Efficient Multi-scale Attention (EMA) module and a Multi-Scale Fusion (MSF) module, which enhance deep visual feature representations without reducing the channel dimension and, by capturing and hierarchically expressing multi-scale features, significantly improve the model's ability to understand and predict physiological signals. Experimental results show that, on both datasets, Systolic Blood Pressure (SBP) prediction meets the British Hypertension Society (BHS) grade C standard and Diastolic Blood Pressure (DBP) prediction meets grade B. The Mean Absolute Error (MAE) is 6.82 mmHg for SBP and 5.17 mmHg for DBP, lower than the values reported in recent comparable studies. Compared with other models, the proposed method generalizes better and achieves lower error, and it offers an effective approach and reference for estimating blood pressure levels from face videos.
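
The two evaluation criteria named in the abstract, MAE and the BHS grade, can be illustrated with a minimal sketch. This is not the authors' evaluation code: the function names and the toy readings are hypothetical, and the thresholds are the commonly cited BHS cumulative-percentage criteria (share of absolute errors within 5, 10, and 15 mmHg; grade A: 60/85/95 %, grade B: 50/75/90 %, grade C: 40/65/85 %, otherwise grade D).

```python
from typing import Sequence

# Commonly cited BHS criteria: minimum percentage of absolute errors
# falling within 5, 10 and 15 mmHg for each grade (best grade first).
BHS_THRESHOLDS = {
    "A": (60.0, 85.0, 95.0),
    "B": (50.0, 75.0, 90.0),
    "C": (40.0, 65.0, 85.0),
}


def absolute_errors(reference: Sequence[float], predicted: Sequence[float]) -> list:
    """Per-sample absolute error in mmHg (hypothetical helper)."""
    if len(reference) != len(predicted) or len(reference) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    return [abs(r - p) for r, p in zip(reference, predicted)]


def mean_absolute_error(reference: Sequence[float], predicted: Sequence[float]) -> float:
    """Mean absolute error (MAE) in mmHg."""
    errors = absolute_errors(reference, predicted)
    return sum(errors) / len(errors)


def bhs_grade(reference: Sequence[float], predicted: Sequence[float]) -> str:
    """Return the best BHS grade whose three thresholds are all met, else 'D'."""
    errors = absolute_errors(reference, predicted)
    n = len(errors)
    # Cumulative percentages of errors within 5, 10 and 15 mmHg.
    within = tuple(100.0 * sum(e <= limit for e in errors) / n for limit in (5, 10, 15))
    for grade, minimums in BHS_THRESHOLDS.items():
        if all(p >= m for p, m in zip(within, minimums)):
            return grade
    return "D"


if __name__ == "__main__":
    # Toy readings for illustration only; they are not data from the paper.
    cuff_sbp = [120, 130, 110, 125]
    predicted_sbp = [123, 128, 118, 126]
    print(mean_absolute_error(cuff_sbp, predicted_sbp))
    print(bhs_grade(cuff_sbp, predicted_sbp))
```

Note that the grade depends on the distribution of errors, not only on their mean, so two models with similar MAE can receive different BHS grades.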

Key words: multi-scale attention, pulsating signal, blood pressure estimation, optical flow, remote Photoplethysmography (rPPG) signal