
Computer Engineering, 2009, Vol. 35, Issue (23): 201-203. doi: 10.3969/j.issn.1000-3428.2009.23.069

• Artificial Intelligence and Recognition Technology •

Recognition of Easily Confused Mandarin Phones

LI Chen-chong1,2, DONG Bin2, PAN Fu-ping2, ZENG Xing-wen1, YAN Yong-hong2

  (1. School of Telecommunication Engineering, Xidian University, Xi'an 710071; 2. ThinkIT Speech Lab, Institute of Acoustics, Chinese Academy of Sciences, Beijing 100190)
  • Received: 1900-01-01 Revised: 1900-01-01 Online: 2009-12-05 Published: 2009-12-05

Abstract: To address the acoustic characteristics of easily confused phones in Mandarin speech recognition, this paper applies wavelet packet decomposition theory to the Perceptual Linear Predictive (PLP) feature and proposes a new feature extraction algorithm that describes the spectra of these phones more accurately. A Gaussian Mixture Model (GMM) is then used to classify the new acoustic features and so discriminate between the confused phones. Experimental results show that the new feature outperforms the traditional PLP feature, reducing the recognition error rate on the easily confused consonants by more than 30%.

Key words: wavelet packet decomposition, Perceptual Linear Predictive (PLP), speech recognition
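
The two stages named in the abstract, wavelet packet analysis of a speech frame followed by GMM classification, can be illustrated with a minimal sketch. This is not the paper's algorithm: the abstract does not say how the decomposition is integrated into PLP, so the example simply assumes a Haar wavelet, uses log subband energies as a stand-in feature, and scores them with diagonal-covariance GMMs whose parameters would normally be trained from data. All function names and the class labels `phone_a` and `phone_b` are hypothetical.

```python
import math

def haar_split(x):
    """One Haar analysis step: (low-pass, high-pass) halves of x."""
    s = math.sqrt(2.0)
    low = [(x[i] + x[i + 1]) / s for i in range(0, len(x) - 1, 2)]
    high = [(x[i] - x[i + 1]) / s for i in range(0, len(x) - 1, 2)]
    return low, high

def wavelet_packet(x, depth):
    """Full wavelet packet tree: every node is split, not only the low band."""
    nodes = [list(x)]
    for _ in range(depth):
        next_nodes = []
        for node in nodes:
            low, high = haar_split(node)
            next_nodes.extend([low, high])
        nodes = next_nodes
    return nodes  # 2**depth subbands, in natural (not frequency) order

def subband_log_energies(frame, depth=3, floor=1e-10):
    """Assumed stand-in feature: log energy of each wavelet packet subband."""
    return [math.log(sum(c * c for c in band) + floor)
            for band in wavelet_packet(frame, depth)]

def gmm_log_likelihood(x, weights, means, variances):
    """Log-likelihood of vector x under a diagonal-covariance GMM."""
    comp = []
    for w, mu, var in zip(weights, means, variances):
        ll = math.log(w)
        for xi, m, v in zip(x, mu, var):
            ll += -0.5 * (math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v)
        comp.append(ll)
    top = max(comp)
    return top + math.log(sum(math.exp(c - top) for c in comp))  # log-sum-exp

def classify(x, models):
    """Pick the class whose GMM gives x the highest log-likelihood."""
    return max(models, key=lambda name: gmm_log_likelihood(x, *models[name]))

# Hypothetical usage: one-dimensional, single-component models for two phones.
models = {
    "phone_a": ([1.0], [[0.0]], [[1.0]]),
    "phone_b": ([1.0], [[3.0]], [[1.0]]),
}
label = classify([0.2], models)
```

In practice a frame would be mapped to a feature vector with `subband_log_energies`, one GMM would be trained per confusable phone, and `classify` would be applied to each frame or segment. The Haar filter is chosen here only because it keeps the sketch short; the frame length should be a multiple of `2**depth` so that no samples are dropped.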
