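[1] Wang Zhiming, Cai Lianhong, Ai Haizhou. Text-to-visual Speech in Chinese Based on Data-driven Approach[J]. Journal of Software, 2005, 16(6): 1054-1063. (in Chinese)
[2] Li Bingfeng, Xie Lei, Zhou Xiangzeng, et al. Real-time Speech Driven Talking Avatar[J]. Journal of Tsinghua University: Science and Technology, 2011, 51(9): 1180-1186. (in Chinese)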
[3] Bregler C, Covell M, Slaney M. Video Rewrite: Driving Visual Speech with Audio[C]//Proc. of SIGGRAPH’97. New York, USA: [s. n.], 1997: 353-360.
[4] Choi K, Luo Y, Hwang J. Hidden Markov Model Inversion for Audio-to-visual Conversion in an MPEG-4 Facial Animation System[J]. Journal of VLSI Signal Processing, 2001, 29(1/2): 51-61.
[5] Terissi L D, Gomez J C. Audio-to-visual Conversion via HMM Inversion for Speech-driven Facial Animation[C]//Proc. of SBIA’08. Brasilia, Brazil: [s. n.], 2008: 33-42.
[6] Gowdy J N, Subramanya A, Bartels C. DBN Based Multi-stream Models for Audio-visual Speech Recognition[C]//Proc. of ICASSP’04. New York, USA: [s. n.], 2004.
[7] Cootes T, Edwards G, Taylor C. Active Appearance Models[C]//Proc. of ECCV’98. Berlin, Germany: [s. n.], 1998: 484-498.
[8] Zhang Yimin, Diao Qian, Huang Shan, et al. DBN Based Multi-stream Models for Speech[C]//Proc. of ICASSP’03. Beijing, China: [s. n.], 2003: 836-839.
[9] Young S, Evermann G, Kershaw D, et al. The HTK Book[M]. Cambridge, UK: Cambridge University Press, 2002.
[10] Hou Y, Sahli H, Ravyse I, et al. Robust Shape Based Head Tracking[C]//Proc. of Advanced Concepts for Intelligent Vision Systems. Amsterdam, the Netherlands: [s. n.], 2007: 340-351.
[11] AM_TOOLS Toolkit[EB/OL]. (2012-10-10). http://personalpages.manchester.ac.uk/staff/timothy.f.cootes/software/am_tools_doc/index.html.
[12] Hirsch H G, Pearce D. The Aurora Experimental Framework for the Performance Evaluation of Speech Recognition Systems under Noisy Conditions[C]//Proc. of International Workshop on Automatic Speech Recognition. Paris, France: [s. n.], 2000: 181-188.
[13] Bilmes J, Zweig G. The Graphical Models Toolkit: An Open Source Software System for Speech and Time Series Processing[C]//Proc. of ICASSP’02. New York, USA: [s. n.], 2002: 3916-3919.
Editor: Suo Shuzhi