摘要: Mel倒谱系数(MFCC)侧重提取语音信号的低频信息,对语音信号的频谱分布特性描述不充分,不能有效区分说话人个性信息。为此,通过分析语音信号各频段所含说话人个性信息的不同,结合Mel滤波器和反Mel滤波器在高低频段的不同特性,提出一种适于说话人识别的改进Mel滤波器。实验结果表明,改进Mel滤波器提取的新特征能够获得比传统Mel倒谱系数以及反Mel倒谱系数(IMFCC)更好的识别效果,并且基本不增加说话人识别系统训练和识别的时间开销。
关键词:
说话人识别,
Mel倒谱系数,
个性信息,
反Mel倒谱系数,
频谱分布,
语音信号
Abstract: The Mel-frequency Cepstral Coefficient (MFCC) focuses on the low-frequency information of the speech signal and does not adequately describe the distribution of the speech spectrum, so it cannot effectively distinguish speaker-specific information. By analyzing how the speaker-specific information differs across the frequency bands of the speech signal, and by combining the different characteristics of the Mel filterbank and the inverted Mel filterbank in the high and low frequency bands, an improved Mel filterbank suited to speaker recognition is proposed. Experimental results show that the new features extracted with the improved filterbank achieve better recognition performance than the traditional MFCC and the Inverted MFCC (IMFCC), with essentially no increase in the training and recognition time of the speaker recognition system.
Key words:
speaker recognition,
Mel-frequency Cepstral Coefficient (MFCC),
speaker-specific information,
Inverted Mel-frequency Cepstral Coefficient (IMFCC),
spectrum distribution,
speech signal
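
As a rough illustration of the contrast the abstract draws between the Mel and inverted Mel filterbanks, the sketch below computes filter center frequencies for both. It is only a sketch under stated assumptions: it uses the common 2595·log10(1 + f/700) Mel formula and assumes the inverted Mel layout is the Mel layout mirrored across the analysis band. The function names, the filter count, and the 4000 Hz upper edge are hypothetical choices for illustration. It does not reproduce the improved filterbank proposed in the paper, whose construction is not given in this abstract.

```python
import math


def hz_to_mel(f_hz):
    """Map a frequency in Hz to the Mel scale (common 2595/700 formula)."""
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)


def mel_to_hz(m):
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)


def mel_centers(n_filters, f_max_hz):
    """Center frequencies (Hz) equally spaced on the Mel scale in (0, f_max_hz).

    The filters are dense at low frequencies and sparse at high frequencies.
    """
    step = hz_to_mel(f_max_hz) / (n_filters + 1)
    return [mel_to_hz(step * (i + 1)) for i in range(n_filters)]


def inverted_mel_centers(n_filters, f_max_hz):
    """Assumed inverted Mel layout: the Mel centers mirrored across the band.

    The filters are sparse at low frequencies and dense at high frequencies.
    """
    return [f_max_hz - c for c in reversed(mel_centers(n_filters, f_max_hz))]


if __name__ == "__main__":
    # Hypothetical settings: 8 filters over a 0-4000 Hz band.
    for name, centers in (("Mel", mel_centers(8, 4000.0)),
                          ("inverted Mel", inverted_mel_centers(8, 4000.0))):
        print(name, [round(c, 1) for c in centers])
```

Under these assumptions, the spacing between adjacent Mel centers grows with frequency, while the spacing between inverted Mel centers shrinks. This is the complementary resolution in the low and high bands that the abstract refers to.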
项要杰,杨俊安,李晋徽,陆俊. 一种适用于说话人识别的改进Mel滤波器[J]. 计算机工程.
XIANG Yao-jie, YANG Jun-an, LI Jin-hui, LU Jun. An Improved Mel-frequency Filter for Speaker Recognition[J]. Computer Engineering.