Computer Engineering ›› 2019, Vol. 45 ›› Issue (2): 250-257. doi: 10.19678/j.issn.1000-3428.0049473

• Multimedia Technology and Application •

Speech Endpoint Detection Method Based on Multi-Feature Fusion and Dynamic Threshold

ZHU Chunli, LI Xin

  1. School of Mechatronics Engineering and Automation, Shanghai University, Shanghai 200072, China
  • Received: 2017-11-28 Online: 2019-02-15 Published: 2019-02-15
  • About the authors: ZHU Chunli (born 1994), male, master's degree, main research interest: speech signal processing; LI Xin, associate research fellow, Ph.D.
  • Supported by:

    Key Project of the Science and Technology Commission of Shanghai Municipality (14DZ1206302)

Abstract:

In environments with a low signal-to-noise ratio and non-stationary noise, traditional feature-based speech endpoint detection methods suffer from low detection accuracy and poor stability. To address this problem, this paper proposes a new speech endpoint detection method. The noisy speech is first denoised by spectral subtraction. The MFCC cepstrum distance between the spectrally subtracted speech signal and the leading non-speech frames is then extracted, and the uniform sub-band frequency-band variance feature is computed. The thresholds are updated dynamically, and a two-parameter double-threshold method is used to determine the endpoints of the noisy speech. Experimental results show that, compared with the endpoint detection method based on DWT-MFCC cepstrum distance and the method based on spectral subtraction and uniform sub-band frequency-band variance, the proposed method achieves a higher detection accuracy and lower miss and false-detection rates.

Key words: endpoint detection, spectral subtraction, MFCC cepstrum distance, uniform sub-band variance, dynamic threshold update
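The decision stage summarized in the abstract (two features, two thresholds per feature, thresholds that follow the noise level) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract gives no formulas, so the function name `detect_endpoints`, the scale factors `k_low` and `k_high`, the smoothing factor `alpha`, the minimum segment length `min_len`, and the choice to combine the two features with a logical OR are all assumptions made for illustration. The inputs `dist` and `var` stand in for per-frame MFCC cepstrum distance and uniform sub-band variance values, which are assumed to be computed elsewhere and to have equal length.

```python
def detect_endpoints(dist, var, n_lead=5, k_low=1.5, k_high=3.0,
                     alpha=0.9, min_len=3):
    """Return (first_frame, last_frame) speech segments from two features.

    Hypothetical sketch: all parameters and the OR-combination of the two
    features are assumptions, not values taken from the paper.
    """
    # Initial noise levels from the leading frames, assumed to hold no speech.
    noise_d = sum(dist[:n_lead]) / n_lead
    noise_v = sum(var[:n_lead]) / n_lead
    segments = []
    candidate = None  # first frame of the current run above the low threshold
    start = None      # confirmed start, set once the high threshold is crossed
    for i, (d, v) in enumerate(zip(dist, var)):
        # Both thresholds scale with the current noise estimate.
        above_low = d > k_low * noise_d or v > k_low * noise_v
        above_high = d > k_high * noise_d or v > k_high * noise_v
        if above_low:
            if candidate is None:
                candidate = i
            if above_high and start is None:
                start = candidate  # backtrack to the low-threshold crossing
        else:
            # Close a confirmed segment if it is long enough.
            if start is not None and i - start >= min_len:
                segments.append((start, i - 1))
            candidate = start = None
            # Dynamic update: track the noise level on non-speech frames only.
            noise_d = alpha * noise_d + (1 - alpha) * d
            noise_v = alpha * noise_v + (1 - alpha) * v
    # Handle a segment that runs to the end of the signal.
    if start is not None and len(dist) - start >= min_len:
        segments.append((start, len(dist) - 1))
    return segments


# Toy feature tracks: 5 noise frames, 5 speech frames, 5 noise frames.
dist = [1.0] * 5 + [2.0, 4.0, 5.0, 4.0, 2.0] + [1.0] * 5
var = [0.2] * 5 + [0.2, 0.9, 1.0, 0.8, 0.2] + [0.2] * 5
print(detect_endpoints(dist, var))  # [(5, 9)]
```

In this sketch a segment starts only after a feature exceeds its high threshold, but its start is placed back at the frame where the low threshold was first crossed. Runs shorter than `min_len` frames are discarded as spurious.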

CLC number: