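Abstract (translated from Chinese): To better model the correlation between audio and visual emotional information, a triple-stream hybrid dynamic Bayesian network emotion recognition model (T_AsyDBN) is proposed. MFCC features and local prosodic features based on fundamental frequency and short-time energy serve as the two audio input streams, which are synchronized at the state level. Facial geometric features and facial action parameter features serve as the visual input stream, which is asynchronous with the audio input streams at the state level. Experimental results show that the model outperforms the audio-visual two-stream DBN model with a state asynchrony constraint, raising the average recognition rate over six emotions from 52.14% to 63.71%.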
Keywords (translated from Chinese):
dynamic Bayesian network,
audio-visual fusion,
emotion recognition,
asynchrony constraint,
weight
Abstract: This paper presents a triple-stream Dynamic Bayesian Network (DBN) model (T_AsyDBN) for audio-visual emotion recognition, in which the two audio streams are synchronous at the state level, while they are asynchronous with the visual stream within controllable constraints. MFCC features and local prosodic features are extracted as audio features, while facial geometric features and facial action unit coefficients are extracted as visual features. Emotion recognition experiments show that, by adjusting the asynchrony constraint, T_AsyDBN performs better than the two-stream audio-visual DBN model (Asy_DBN), with the average recognition rate improving from 52.14% to 63.71%.
Key words:
Dynamic Bayesian Networks (DBN),
audio visual fusion,
emotion recognition,
asynchrony constraint,
weight
CLC number:
吕兰兰, 蒋冬梅, 王风娜, Hichem Sahli, Werner Verhelst. 基于三流DBN模型的听视觉情感识别[J]. 计算机工程, 2012, 38(5): 161-162,166.
LV Lan-Lan, JIANG Dong-Mei, WANG Feng-Na, Hichem Sahli, Werner Verhelst. Audio Visual Emotion Recognition Based on Triple Stream DBN Model[J]. Computer Engineering, 2012, 38(5): 161-162,166.
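
The abstract describes the audio streams as sharing one state sequence while the visual stream may drift from it within a controllable bound. The following is a minimal sketch of that idea only, not the authors' implementation. It assumes that asynchrony is measured as the absolute difference between integer state indices, and the names `allowed_joint_states`, `satisfies_constraint`, and `max_async` are hypothetical; the abstract does not specify any of these.

```python
from itertools import product
from typing import List, Sequence, Tuple


def allowed_joint_states(n_states: int, max_async: int) -> List[Tuple[int, int]]:
    """List the (audio_state, visual_state) pairs permitted by the bound.

    Assumption: one audio state index stands for both audio streams,
    because they are synchronous at the state level. The bound
    |audio_state - visual_state| <= max_async is an illustrative choice.
    """
    return [
        (a, v)
        for a, v in product(range(n_states), repeat=2)
        if abs(a - v) <= max_async
    ]


def satisfies_constraint(
    audio_path: Sequence[int], visual_path: Sequence[int], max_async: int
) -> bool:
    """Check that two frame-aligned state paths never drift past the bound."""
    if len(audio_path) != len(visual_path):
        raise ValueError("state paths must cover the same number of frames")
    return all(abs(a - v) <= max_async for a, v in zip(audio_path, visual_path))
```

With `max_async = 0` the sketch reduces to a fully synchronous model, and larger values admit more joint states, which is one way to read "adjusting the asynchrony constraint" in the abstract.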