Abstract:
This paper studies speech feature extraction based on wavelet packet analysis for speech-driven lip movement synthesis. To capture the dynamic characteristics of speech, it combines feature differences with multiple speech frames selected according to the correlation between neighboring lip frames, and it applies Principal Component Analysis (PCA) to reduce the dimensionality of the input speech features. A speech-to-visual mapping model based on the Input-Output Hidden Markov Model (IOHMM) is then used to build the speech-driven lip movement synthesis system. Experiments indicate that the extracted speech features are more robust than traditional Mel-frequency cepstral coefficients and that the synthesized lip sequences are more coherent and natural.
Keywords:
visual speech,
wavelet packet analysis,
Principal Component Analysis (PCA)
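The feature pipeline summarized in the abstract (wavelet packet features, feature differences, multi-frame context, PCA) can be illustrated with a minimal sketch. This is not the authors' implementation: the Haar wavelet, the decomposition level, the one-frame context on each side, and the output dimension are all assumptions chosen for illustration, and every function name below is hypothetical.

```python
import numpy as np

def haar_wp_log_energies(frame, level=3):
    """Log energy of each Haar wavelet-packet subband (natural, not frequency, order).
    Assumes len(frame) is divisible by 2**level."""
    nodes = [np.asarray(frame, dtype=float)]
    for _ in range(level):
        nxt = []
        for x in nodes:
            even, odd = x[0::2], x[1::2]
            nxt.append((even + odd) / np.sqrt(2.0))  # low-pass branch
            nxt.append((even - odd) / np.sqrt(2.0))  # high-pass branch
        nodes = nxt
    energies = np.array([np.sum(n ** 2) for n in nodes])
    return np.log(energies + 1e-10)

def add_deltas(feats):
    """Append first-order frame differences; the first frame gets a zero delta."""
    delta = np.zeros_like(feats)
    delta[1:] = feats[1:] - feats[:-1]
    return np.hstack([feats, delta])

def stack_context(feats, k=1):
    """Concatenate frames t-k..t+k for each frame t, repeating the edge frames."""
    T = feats.shape[0]
    padded = np.vstack([feats[:1]] * k + [feats] + [feats[-1:]] * k)
    return np.hstack([padded[i:i + T] for i in range(2 * k + 1)])

def pca_reduce(X, n_components):
    """Project centered data onto its top principal directions via SVD."""
    Xc = X - X.mean(axis=0)
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:n_components].T

# Usage on synthetic data: 20 frames of 64 samples each (assumed sizes).
rng = np.random.default_rng(0)
frames = rng.standard_normal((20, 64))
static = np.array([haar_wp_log_energies(f, level=3) for f in frames])  # (20, 8)
dynamic = add_deltas(static)                                           # (20, 16)
context = stack_context(dynamic, k=1)                                  # (20, 48)
reduced = pca_reduce(context, n_components=5)                          # (20, 5)
```

In a system of the kind described, vectors like `reduced` would serve as the input observations of the speech-to-visual mapping model; the IOHMM itself is outside the scope of this sketch.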
MA E-e; LIU Ying; WANG Cheng-ru. Speech-driven Lip Movement Synthesize System Based on IOHMM[J]. Computer Engineering, 2009, 35(18): 283-285.
马娥娥;刘 颖;王成儒. 基于IOHMM的语音驱动唇动合成系统[J]. 计算机工程, 2009, 35(18): 283-285.