
Computer Engineering, 2023, Vol. 49, Issue (5): 81-89, 96. doi: 10.19678/j.issn.1000-3428.0064241

• Artificial Intelligence and Pattern Recognition •

Research on EEG Emotion Recognition with Fusion of Multi-Scale Features

JIAO Yi, XU Huaxing, MAO Xiaobo, LI Nan, YAO Guoliang, NI Jinhong, XU Xiangyang

  1. School of Electrical Engineering, Zhengzhou University, Zhengzhou 450001, China
  • Received: 2022-03-21 Revised: 2022-05-14 Published: 2022-08-19
  • About the authors: JIAO Yi (born 1995), male, master's student, main research interest: EEG signal recognition; XU Huaxing, associate professor, Ph.D.; MAO Xiaobo (corresponding author), professor, Ph.D., doctoral supervisor; LI Nan, doctoral student; YAO Guoliang, NI Jinhong, and XU Xiangyang, master's students.
  • Funding:
    National Key Research and Development Program of China (2020YFC2006100); Major Central-Level Budget Increase/Decrease Project (2060302-1802-03).

Abstract: Electroencephalogram (EEG) emotion recognition algorithms based on single-scale two- and three-dimensional convolutions have several shortcomings: information is easily lost when the raw signal is mapped to a high-dimensional feature matrix, the models have a large number of parameters, and the extracted features are relatively homogeneous. To address these problems, this paper proposes a Multiscale Pyramid Interactive Attention Residual Convolution Network (MPIAResnet). Multi-scale one-dimensional convolution kernels directly extract multi-scale spatial features from the raw EEG signals. Standard convolution is replaced by grouped convolution, which gives the network fewer parameters than two- and three-dimensional convolutions, and a channel interactive attention mechanism optimizes the feature extraction process. MPIAResnet is then combined with a Bidirectional GRU (BiGRU) to form the MPIAResnet-BiGRU network, which further extracts the contextual semantic information of the EEG signals and thereby fuses their spatial and temporal features. Experiments on the public DEAP dataset show that, in the subject-dependent experiment, the model reaches recognition accuracies of 97.60% and 98.15% in the Valence and Arousal dimensions, respectively, which are 8.56 and 8.36 percentage points higher than those of the single-scale model. In the small-training-set experiments, the test-set accuracy remains above 90% when the training set proportion is 30%. In the frequency-band experiment, the recognition accuracy on the two high-frequency bands is higher than that on the low-frequency bands, which demonstrates the effectiveness of the model. In the experiments involving all subjects, the proposed model also achieves higher recognition accuracy than the comparison methods.

Key words: Electroencephalogram (EEG) signal, multi-scale convolution, attention mechanism, Bidirectional GRU (BiGRU), spatio-temporal features
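
The abstract's claim that grouped one-dimensional convolution needs fewer parameters than standard two- and three-dimensional convolution can be checked with simple arithmetic. The sketch below is only an illustration under stated assumptions: the channel counts, kernel sizes, and group count are hypothetical and are not taken from the paper, and `conv_params` is a helper written for this example, not part of MPIAResnet.

```python
from math import prod


def conv_params(c_in, c_out, kernel_shape, groups=1, bias=True):
    """Count the learnable parameters of a single convolution layer.

    With `groups` groups, each output channel only sees c_in / groups
    input channels, so the weight count shrinks by a factor of `groups`.
    """
    if c_in % groups or c_out % groups:
        raise ValueError("c_in and c_out must both be divisible by groups")
    weights = (c_in // groups) * c_out * prod(kernel_shape)
    return weights + (c_out if bias else 0)


# Hypothetical layer sizes, chosen only for illustration.
C_IN, C_OUT, GROUPS = 64, 64, 4

standard_1d = conv_params(C_IN, C_OUT, (3,))
grouped_1d = conv_params(C_IN, C_OUT, (3,), groups=GROUPS)
standard_2d = conv_params(C_IN, C_OUT, (3, 3))
standard_3d = conv_params(C_IN, C_OUT, (3, 3, 3))

# A multi-scale block runs several kernel sizes in parallel; the sizes
# 3, 5, and 7 are assumptions, not the paper's configuration.
multi_scale_grouped = sum(
    conv_params(C_IN, C_OUT, (k,), groups=GROUPS) for k in (3, 5, 7)
)

for name, count in [
    ("standard 1D, k=3", standard_1d),
    ("grouped 1D, k=3, 4 groups", grouped_1d),
    ("standard 2D, 3x3", standard_2d),
    ("standard 3D, 3x3x3", standard_3d),
    ("grouped 1D, k=3/5/7 in parallel", multi_scale_grouped),
]:
    print(f"{name:34s} {count:>8d}")
```

Under these assumed sizes, even three parallel grouped 1D branches together use fewer parameters than one standard 3x3 two-dimensional layer, which is consistent with the direction of the abstract's claim; the actual savings in the paper depend on its real layer configuration.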

CLC Number: