Abstract: In speech recognition, the Continuous Hidden Markov Model (CHMM) is usually initialized with the segmental K-means algorithm, which can cause the model parameters to converge to a local optimum. To address this problem, a CHMM initialization algorithm based on density and distance parameters is proposed. The distance and density parameters of the data objects are computed, and objects that have high density and are also far apart from one another are selected as the initial cluster centers. K-means clustering is then run from these centers to obtain the final cluster centers, which are used to initialize the CHMM parameters. Experimental results show that the proposed algorithm achieves a higher speech recognition rate than the random selection algorithm.
Key words:
speech recognition,
Continuous Hidden Markov Model (CHMM),
K-means algorithm,
local optimum,
parameter initialization
CLC number:
XIAN Xiaodong (鲜晓东), LV Jianzhong (吕建中), FAN Yuxing (樊宇星). Initial Estimation of CHMM Acoustic Model Based on Density and Distance Parameter[J]. Computer Engineering (计算机工程).
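
The initialization idea summarized in the abstract (pick dense, mutually distant data objects as initial centers, then refine them with K-means) can be illustrated with a minimal sketch. The abstract does not give the exact definitions, so everything concrete here is an assumption made for illustration: density is taken as the number of points within a hypothetical `radius`, candidates are points with at least mean density, and later centers are chosen by maximizing the minimum distance to the centers already picked. The function names are likewise hypothetical. How the final centers map onto CHMM parameters is not stated in the abstract, so that step is left out.

```python
import math
import random


def dist(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def select_initial_centers(points, k, radius):
    """Pick k initial centers that are dense and far from one another.

    Assumption: density of a point = number of points within `radius`.
    Assumption: candidates = points whose density is at least the mean.
    """
    if k > len(points):
        raise ValueError("k must not exceed the number of points")
    density = [sum(1 for q in points if dist(p, q) <= radius) for p in points]
    mean_density = sum(density) / len(points)
    candidates = [i for i, d in enumerate(density) if d >= mean_density]
    # First center: the densest point.
    chosen = [max(candidates, key=lambda i: density[i])]
    # Each further center: the candidate farthest from all chosen centers.
    while len(chosen) < k:
        remaining = [i for i in candidates if i not in chosen]
        if not remaining:  # candidates exhausted: fall back to every point
            remaining = [i for i in range(len(points)) if i not in chosen]
        chosen.append(max(
            remaining,
            key=lambda i: min(dist(points[i], points[c]) for c in chosen),
        ))
    return [list(points[i]) for i in chosen]


def kmeans(points, centers, max_iter=100):
    """Plain Lloyd-style K-means started from the given centers."""
    centers = [list(c) for c in centers]
    labels = []
    for _ in range(max_iter):
        # Assignment step: nearest center for every point.
        labels = [
            min(range(len(centers)), key=lambda j: dist(p, centers[j]))
            for p in points
        ]
        # Update step: each center moves to the mean of its members.
        new_centers = []
        for j, old in enumerate(centers):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                new_centers.append(
                    [sum(col) / len(members) for col in zip(*members)]
                )
            else:
                new_centers.append(old)  # empty cluster keeps its center
        if new_centers == centers:
            break
        centers = new_centers
    return centers, labels


if __name__ == "__main__":
    # Synthetic 2-D data standing in for acoustic feature vectors.
    rng = random.Random(0)
    blobs = [(0.0, 0.0), (5.0, 5.0), (0.0, 5.0)]
    data = [
        [cx + rng.gauss(0, 0.3), cy + rng.gauss(0, 0.3)]
        for cx, cy in blobs
        for _ in range(30)
    ]
    init = select_initial_centers(data, k=3, radius=1.0)
    final_centers, labels = kmeans(data, init)
    print(final_centers)
```

Because the starting centers are chosen deterministically from the data rather than at random, repeated runs on the same features give the same clustering, which is the property the abstract contrasts with random selection.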