Abstract: Taking Uyghur as an example, this paper studies continuous speech recognition for minority languages that lack natural speech corpora. A seed acoustic model is first trained with HTK on a small amount of manually labeled speech; the seed model then guides the training of the final acoustic model on a larger speech corpus and its transcriptions. A statistical language model is built with palmkit, and continuous speech recognition is performed with the open-source software Julius. In the experiments, 6 400 short sentences spoken freely by 64 native Uyghur speakers are used to train a monophone acoustic model, and a class-based 3-gram language model is built from 100 MB of text and a dictionary of 60 000 words. Test results show a recognition rate of 72.5% for real-time recognition, 4.2 percentage points higher than the 68.3% obtained with HTK alone.
Key words:
continuous speech recognition,
seed model,
acoustic model,
language model,
Uyghur
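The abstract names a class-based 3-gram language model without giving its form. As a reading aid, the standard class-based trigram factorization is sketched below; it assumes each word $w_i$ belongs to exactly one class $c_i$, and it may differ in detail from the formulation the authors used with palmkit.

```latex
% Standard class-based trigram approximation (an assumption, not taken from the paper):
% the class sequence is modeled by a trigram, and each word is emitted from its class.
P(w_i \mid w_{i-2}, w_{i-1}) \approx P(w_i \mid c_i)\, P(c_i \mid c_{i-2}, c_{i-1})
```

Because the trigram statistics are collected over classes rather than individual words, far fewer parameters have to be estimated, which is the usual motivation for this model when training text is limited.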
CLC number:
武晓敏, 达瓦·伊德木草, 吾守尔·斯拉木. 自然语料缺乏的民族语言连续语音识别[J]. 计算机工程, 2012, 38(12): 129-131.
WU Xiao-Min, DA Wa·Yi-De-Mu-Cao, WU Shou-Er·Shi-La-Mu. Continuous Speech Recognition for Natural Resource-deficient Minority Languages[J]. Computer Engineering, 2012, 38(12): 129-131.