Computer Engineering, 2012, Vol. 38, Issue (5): 189-191. doi: 10.3969/j.issn.1000-3428.2012.05.058

• Artificial Intelligence and Recognition Technology •

Research on Acoustic Model of Large-vocabulary Continuous Speech Recognition for Lhasa Tibetan

LI Guan-yu 1, MENG Meng 2

  1. China Minorities Information Technology Institute, Northwest University for Nationalities, Lanzhou 730030, China
  2. Digital Content Technology and System Research Center, Institute of Automation, Chinese Academy of Sciences, Beijing 100190, China
  • Received: 2011-11-08 Online: 2012-03-05 Published: 2012-03-05
  • About the authors: LI Guan-yu (born 1973), male, lecturer, M.S.; main research interests: pattern recognition and Chinese information processing. MENG Meng, assistant researcher, Ph.D.
  • Funding:
    National Natural Science Foundation of China (60970071); Fundamental Research Funds for the Central Universities (zyz2011100, ycx11009)

Abstract: Based on the characteristics of Tibetan, this paper proposes an acoustic model for large-vocabulary continuous speech recognition of Lhasa Tibetan, using high-level Tibetan linguistic knowledge to reduce ambiguity in pattern matching. A framework for automatic speech recognition of the Lhasa dialect is designed, and several feasible acoustic modeling units are analyzed. With phonemes and initials/finals (semi-syllables) as the modeling units, context-dependent continuous Hidden Markov Models (HMMs) are built and trained on the Hidden Markov Model Toolkit (HTK) platform to implement speaker-dependent large-vocabulary continuous speech recognition of Lhasa Tibetan. Experimental results show that the Word Error Rate (WER) is 7.8% in the best case.

Key words: Tibetan, Lhasa dialect, continuous speech recognition, Hidden Markov Model (HMM), Hidden Markov Model Toolkit (HTK), acoustic model
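
For readers unfamiliar with the metric quoted in the abstract, WER is conventionally defined as the number of word substitutions, deletions, and insertions needed to turn the recognizer output into the reference transcript, divided by the number of reference words. The following is a minimal sketch of that conventional definition, not the authors' scoring code; the abstract does not state which tool produced the 7.8% figure. The function name `word_error_rate` and the placeholder tokens are hypothetical, and whitespace-separated tokens are assumed to be words.

```python
def word_error_rate(reference, hypothesis):
    """Return (substitutions + deletions + insertions) / number of reference words.

    Both arguments are whitespace-separated token strings (an assumption made
    for this sketch; real Tibetan transcripts would need their own tokenizer).
    """
    ref, hyp = reference.split(), hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen so far
    # and the first j hypothesis words (Levenshtein distance, row by row).
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # deleting all i reference words matches an empty hypothesis
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1] / len(ref)


# Placeholder tokens, not data from the paper: one substitution and one
# deletion against a four-word reference.
print(word_error_rate("w1 w2 w3 w4", "w1 wX w3"))
```

Because insertions count as errors, WER computed this way can exceed 100% when the hypothesis is much longer than the reference.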

CLC Number: