Computer Engineering

• Artificial Intelligence and Recognition Technology •

An Improved Mel-frequency Filter for Speaker Recognition

XIANG Yao-jie 1,2, YANG Jun-an 1,2, LI Jin-hui 1,2, LU Jun 1,2

  1. Department of Information, Electronic Engineering Institute, Hefei 230037, China; 2. Anhui Province Key Laboratory of Electronic Restriction Technology, Hefei 230037, China
  • Received: 2012-09-05 Online: 2013-11-15 Published: 2013-11-13
  • About the authors: XIANG Yao-jie (born 1987), male, master's student, main research interest: speech recognition; YANG Jun-an, professor and doctoral supervisor; LI Jin-hui and LU Jun, master's students
  • Supported by:
    National Natural Science Foundation of China (60872113)

Abstract: Mel-frequency Cepstral Coefficients (MFCC) emphasize the low-frequency information of the speech signal and describe its spectral distribution insufficiently, so they cannot effectively distinguish speaker-specific information. To address this, the distribution of speaker-specific information across the frequency bands of the speech signal is analyzed, the different characteristics of the Mel filterbank and the inverted Mel filterbank in the high- and low-frequency bands are combined, and an improved Mel filterbank better suited to speaker recognition is proposed. Experimental results show that the new features extracted with the improved filterbank achieve better recognition performance than the traditional MFCC and the Inverted Mel-frequency Cepstral Coefficients (IMFCC), with essentially no increase in the training and recognition time of the speaker recognition system.

Key words: speaker recognition, Mel-frequency Cepstral Coefficient (MFCC), speaker-specific information, Inverted Mel-frequency Cepstral Coefficient (IMFCC), spectrum distribution, speech signal
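
The abstract does not state how the improved filterbank combines its two building blocks, so the following is only a minimal sketch of those building blocks and not the authors' filter. It assumes the widely used Mel mapping mel(f) = 2595 log10(1 + f/700), and it assumes an inverted Mel layout obtained by mirroring the Mel center frequencies about the middle of the band, which is one common construction. The function names, the 20 filters, and the 4000 Hz upper edge are all illustrative assumptions.

```python
import math


def hz_to_mel(f_hz):
    """Map a frequency in Hz to the Mel scale (common 2595/700 form)."""
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)


def mel_to_hz(m):
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)


def mel_centers(num_filters, f_max_hz):
    """Center frequencies (Hz) of filters equally spaced on the Mel scale.

    The centers are dense at low frequencies and sparse at high ones.
    """
    step = hz_to_mel(f_max_hz) / (num_filters + 1)
    return [mel_to_hz(step * i) for i in range(1, num_filters + 1)]


def inverted_mel_centers(num_filters, f_max_hz):
    """Assumed inverted layout: mirror the Mel centers about f_max_hz / 2.

    The centers become dense at high frequencies and sparse at low ones.
    """
    return sorted(f_max_hz - c for c in mel_centers(num_filters, f_max_hz))


if __name__ == "__main__":
    mel = mel_centers(20, 4000.0)
    inv = inverted_mel_centers(20, 4000.0)
    for m, v in zip(mel, inv):
        print(f"{m:8.1f} Hz   {v:8.1f} Hz")
```

Printing the two columns side by side shows the complementary resolution that the abstract refers to: the Mel layout resolves the low band finely, while the mirrored layout resolves the high band finely.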

CLC number: