Computer Engineering

• Artificial Intelligence and Recognition Technology •

Speech Driven Facial Animation Synthesis Based on State Asynchronous DBN

ZHAO Yong 1, JIANG Dong-mei 1, Sahli Hichem 2

  (1. School of Computer Science, Northwestern Polytechnical University, Xi'an 710072, China; 2. ETRO Department, Vrije Universiteit Brussel, Brussels 1050, Belgium)
  • Received: 2013-01-01 Online: 2014-02-15 Published: 2014-02-13
  • About the authors: ZHAO Yong (born 1988), male, master's student, main research interest: visual speech synthesis; JIANG Dong-mei and Sahli Hichem, professors
  • Supported by:
    National Natural Science Foundation of China (61273265); Key Project of the Shaanxi Province International Science and Technology Cooperation Fund (2011KW-04)

Abstract: A speech-driven facial animation synthesis method based on an audio-visual Dynamic Bayesian Network model with State Asynchrony (SA-DBN) is proposed, converting acoustic speech into photo-realistic facial animation. Perceptual Linear Prediction (PLP) features extracted from the audio and Active Appearance Model (AAM) features extracted from the face images of an audio-visual speech database are used to train the model parameters. For a given input speech signal, the optimal AAM feature sequence is estimated under the Maximum Likelihood Estimation (MLE) criterion and then used to construct the facial image sequence and the facial animation. A subjective evaluation compares the proposed SA-DBN with a DBN model whose audio and visual states are synchronous. The results show that, by limiting the maximum degree of asynchrony between the audio speech states and the visual speech states, SA-DBN produces clear, natural facial animations whose mouth movements are highly consistent with the input speech.

Key words: facial animation synthesis, Dynamic Bayesian Network model with State Asynchrony (SA-DBN), asynchrony constraint, Active Appearance Model (AAM), Perceptual Linear Prediction (PLP), Maximum Likelihood Estimation (MLE)
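The asynchrony constraint named in the abstract can be illustrated with a small sketch. The abstract does not say how asynchrony is measured, so the code below assumes it is the absolute difference between the audio and visual state indices within a unit; the function name `allowed_state_pairs` and the parameter `max_async` are hypothetical and not taken from the paper.

```python
from itertools import product


def allowed_state_pairs(n_audio, n_visual, max_async):
    """List the (audio_state, visual_state) pairs a constrained model may occupy.

    Assumption (not from the paper): asynchrony is the absolute difference
    between the two state indices, and it must not exceed max_async.
    """
    return [
        (a, v)
        for a, v in product(range(n_audio), range(n_visual))
        if abs(a - v) <= max_async
    ]


# max_async = 0 reduces to a state-synchronous model: both streams
# must always be in the same state.
sync_pairs = allowed_state_pairs(3, 3, 0)

# max_async = 1 lets one stream lead or lag the other by at most one state.
async_pairs = allowed_state_pairs(3, 3, 1)

print(sync_pairs)
print(len(async_pairs))
```

Under this assumption, a larger `max_async` enlarges the joint state space that decoding has to search, while a value of zero recovers the synchronous baseline that the evaluation compares against.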
