Abstract:
To better distinguish easily confused phones in Mandarin speech recognition, this paper applies wavelet packet decomposition theory to the Perceptual Linear Predictive (PLP) acoustic feature and proposes a new feature extraction algorithm. The new feature describes the frequency spectrum of the easily confused phones more accurately. A Gaussian Mixture Model (GMM) is used to classify the new feature and thereby discriminate between the phones. Experimental results show that, compared with the traditional PLP feature, the recognition error rate for these easily confused consonants decreases by more than 30%.
Key words:
wavelet packet decomposition,
Perceptual Linear Predictive (PLP),
speech recognition
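
The abstract does not say which wavelet, which tree depth, or how the subbands are combined with the PLP analysis, so the following is only a minimal sketch of wavelet packet decomposition itself, not the authors' algorithm. It assumes a Haar wavelet and a three-level tree, and the function names (`haar_step`, `wavelet_packet`, `subband_energies`) are hypothetical. It shows the property that distinguishes a wavelet packet tree from an ordinary wavelet transform: both the low-pass and the high-pass halves are split again at every level, giving a uniform set of subbands.

```python
import math


def haar_step(signal):
    """Split an even-length sequence into Haar low-pass and high-pass halves."""
    s = 1.0 / math.sqrt(2.0)
    approx = [(signal[i] + signal[i + 1]) * s for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) * s for i in range(0, len(signal), 2)]
    return approx, detail


def wavelet_packet(signal, depth):
    """Return the 2**depth leaf nodes of a full Haar wavelet packet tree.

    Assumption: Haar filters, chosen only for brevity.
    Leaves are in natural tree order, which is not sorted by frequency.
    """
    if len(signal) % (2 ** depth) != 0:
        raise ValueError("signal length must be divisible by 2**depth")
    nodes = [list(signal)]
    for _ in range(depth):
        next_nodes = []
        for node in nodes:
            # Unlike the plain wavelet transform, the detail branch is split too.
            approx, detail = haar_step(node)
            next_nodes.extend([approx, detail])
        nodes = next_nodes
    return nodes


def subband_energies(signal, depth):
    """Energy (sum of squared coefficients) in each wavelet packet subband."""
    return [sum(c * c for c in node) for node in wavelet_packet(signal, depth)]


# Illustrative use on a synthetic 64-sample frame (not data from the paper).
frame = [math.sin(2.0 * math.pi * 5.0 * n / 64.0) for n in range(64)]
energies = subband_energies(frame, depth=3)
print(len(energies), energies)
```

Because the Haar transform is orthonormal, the subband energies sum to the energy of the input frame, which is a convenient sanity check. In a feature extractor, per-subband energies of this kind could take the place of a fixed filter bank output, but how the paper actually does this is not stated in the abstract.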
摘要: 针对汉语普通话语音识别中易混淆音素的声学特征,把小波包分解理论应用在感觉加权线性预测(PLP)特征中,提出一种新的特征参数提取算法,可以更精确地描述易混淆音素的频谱特征。使用高斯混合模型对新的声学特征进行分类,从而达到区分的目的。实验结果证明,新的特征参数识别结果优于使用传统PLP特征参数的识别结果,识别错误率下降30%以上。
关键词:
小波包分解,
感觉加权线性预测,
语音识别
LI Chen-chong; DONG Bin; PAN Fu-ping; ZENG Xing-wen; YAN Yong-hong. Recognition of Easily Confused Mandarin Phone[J]. Computer Engineering, 2009, 35(23): 201-203.
李晨冲;董 滨;潘复平;曾兴雯;颜永红. 汉语普通话易混淆音素的识别[J]. 计算机工程, 2009, 35(23): 201-203.