Computer Engineering, 2014, Vol. 40, Issue (12): 277-281. doi: 10.3969/j.issn.1000-3428.2014.12.052

• Multimedia Technology and Application •

基于动态特性的D-LTSV语音端点检测方法

赵欢,冯璐,陈佐,张希翔   

  1. 湖南大学信息科学与工程学院,长沙 410082
  • 收稿日期:2013-11-14 修回日期:2014-01-28 出版日期:2014-12-15 发布日期:2015-01-16
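  • About the authors: ZHAO Huan (born 1967), female, professor and doctoral supervisor; main research interests: speech signal processing, embedded system design, and embedded speech recognition. FENG Lu, master's student. CHEN Zuo, lecturer, Ph.D. ZHANG Xixiang, doctoral student.
  • Supported by:
    General Program of the National Natural Science Foundation of China (61173106).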

D-LTSV Voice Activity Detection Method Based on Dynamic Feature

ZHAO Huan, FENG Lu, CHEN Zuo, ZHANG Xixiang

  1. College of Computer Science and Electronic Engineering, Hunan University, Changsha 410082, China
  • Received: 2013-11-14 Revised: 2014-01-28 Online: 2014-12-15 Published: 2015-01-16

摘要: 端点检测是语音信号处理的一个关键环节。为提高语音在低信噪比以及非平稳噪声环境下的端点检测性能,在长时信号变化特征(LTSV)的基础上提出一种新的D-LTSV语音端点检测方法。采用Bartlett-Welch方法估计语音谱,分析语音谱在长时域上的熵,利用倒谱的动态特性分析方法提取连续帧熵值的动态变化特征。实验结果表明,D-LTSV综合考虑了语音的非平稳性和帧间非平稳性的动态变化情况,具有比LTSV更好的分辨能力,特别是在低信噪比和非平稳噪声的环境下,D-LTSV的分辨能力提升了50.77%,能够准确地进行端点检测,具有更强的鲁棒性。

关键词: 语音端点检测, 语音谱, 长时特征, 动态特性, 熵, 分辨力

Abstract: Voice Activity Detection (VAD) is a critical step in speech signal processing. To improve VAD performance under low Signal-to-Noise Ratio (SNR) and nonstationary noise, this paper proposes a new method, D-LTSV, built on the Long-Term Signal Variability (LTSV) feature. The method estimates the signal spectrum with the Bartlett-Welch method, analyzes the entropy of that spectrum over a long-term window, and applies the dynamic-feature analysis used for the cepstrum to extract the dynamic changes in the entropy across consecutive frames. D-LTSV thus accounts for both the nonstationarity of the signal and how that nonstationarity changes from frame to frame. Experimental results show that D-LTSV has greater discriminative power than LTSV; under low SNR and nonstationary noise in particular, its discriminative power improves by 50.77%, making endpoint detection more accurate and more robust.

Key words: Voice Activity Detection (VAD), voice spectrum, long-term feature, dynamic characteristic, entropy, discriminative power
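
The processing chain outlined in the abstract (Bartlett-Welch spectrum estimate, long-term entropy, cepstrum-style dynamic features) can be illustrated with a minimal Python sketch. The abstract does not give the formulas, so everything below is an assumption: LTSV is taken in its commonly cited form (per-bin entropy over the last R frames, then variance across frequency bins), the dynamic feature uses the standard delta regression formula from cepstral analysis, and the way the two are combined in `d_ltsv_sketch` is a guess made for illustration only. All function names and parameter values (`frame_len`, `hop`, `M`, `R`, `T`) are hypothetical.

```python
import numpy as np

def frame_power_spectrum(x, frame_len=320, hop=160, n_fft=512):
    """Short-time power spectrum, shape (n_frames, n_fft // 2 + 1)."""
    n_frames = 1 + (len(x) - frame_len) // hop
    window = np.hanning(frame_len)
    frames = np.stack([x[i * hop:i * hop + frame_len] * window
                       for i in range(n_frames)])
    return np.abs(np.fft.rfft(frames, n=n_fft, axis=1)) ** 2

def bartlett_welch(spec, M=10):
    """Smooth each frame by averaging it with up to M-1 preceding frames."""
    out = np.empty_like(spec)
    for m in range(len(spec)):
        out[m] = spec[max(0, m - M + 1):m + 1].mean(axis=0)
    return out

def long_term_entropy(spec, R=30, eps=1e-12):
    """Per-bin entropy of the spectrum over the last R frames (fewer at the start)."""
    ent = np.zeros(spec.shape)
    for m in range(len(spec)):
        block = spec[max(0, m - R + 1):m + 1] + eps
        p = block / block.sum(axis=0, keepdims=True)  # normalize over time per bin
        ent[m] = -(p * np.log(p)).sum(axis=0)
    return ent

def ltsv(ent):
    """Assumed LTSV form: variance of the per-bin entropies across frequency."""
    return ent.var(axis=1)

def delta(feat, T=2):
    """Cepstrum-style regression delta along the time axis (edge-padded)."""
    n = len(feat)
    padded = np.pad(feat, ((T, T), (0, 0)), mode="edge")
    num = sum(t * (padded[T + t:n + T + t] - padded[T - t:n + T - t])
              for t in range(1, T + 1))
    return num / (2 * sum(t * t for t in range(1, T + 1)))

def d_ltsv_sketch(x, M=10, R=30, T=2):
    """Illustrative combination only: variance across bins of the entropy deltas."""
    ent = long_term_entropy(bartlett_welch(frame_power_spectrum(x), M), R)
    return ltsv(ent), delta(ent, T).var(axis=1)
```

Calling `d_ltsv_sketch(x)` on a one-dimensional sample array returns two per-frame tracks, the assumed LTSV and the illustrative dynamic variant; a detector would then threshold one or both, with the thresholding rule left open here because the abstract does not describe it.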
