
Computer Engineering, 2020, Vol. 46, Issue (4): 316-320. doi: 10.19678/j.issn.1000-3428.0054556

• Development Research and Engineering Application •

Speech Enhancement Method Based on Quasi Recurrent Neural Network

LOU Yingxi, YUAN Wenhao, PENG Rongqun

  1. School of Computer Science and Technology, Shandong University of Technology, Zibo, Shandong 255000, China
  • Received: 2019-04-10  Revised: 2019-05-13  Online: 2020-04-15  Published: 2019-05-24
  • About the authors: LOU Yingxi (1996-), female, M.S. candidate, whose main research interest is speech enhancement; YUAN Wenhao, lecturer; PENG Rongqun, associate professor.
  • Funding:
    National Natural Science Foundation of China (61701286, 11704229); Natural Science Foundation of Shandong Province (ZR2015FL003, ZR2017MF047, ZR2017LA011).


Abstract: In deep learning based speech enhancement models, the Long Short-Term Memory Network (LSTM) handles sequential speech enhancement well, but its training becomes slow when large-scale noisy speech data are involved. To address this problem, a speech enhancement method based on the Quasi Recurrent Neural Network (QRNN) is proposed. Gate functions and memory cells are used to preserve the contextual dependencies of the noisy speech sequence, while the computation of the gate functions no longer depends on the output of the previous time step. Moreover, matrix convolution operations are introduced into both the processing of the noisy speech input and the computation of the gate functions, so that the model can process speech information at multiple time steps simultaneously, which strengthens its capacity for parallel computation. Experimental results show that, compared with the LSTM network, the proposed method effectively improves the training speed of the network model while maintaining speech enhancement performance.

Key words: speech enhancement, Quasi Recurrent Neural Network (QRNN), Long Short-Term Memory Network (LSTM), neural network, convolution operation
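As an illustration of the mechanism summarized in the abstract, the following is a minimal sketch of a single QRNN layer written in PyTorch. This is an assumed framework and an illustrative implementation, not the authors' published code; the class name QRNNLayer, the kernel width, and the layer sizes are hypothetical. The candidate activations and the forget and output gates are all produced by one causal 1-D convolution over the input sequence, so they can be computed for every time step in parallel, and only the light-weight element-wise fo-pooling recurrence remains sequential.

    # Minimal QRNN layer sketch (assumed PyTorch implementation, for illustration only).
    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class QRNNLayer(nn.Module):
        def __init__(self, input_size, hidden_size, kernel_size=2):
            super().__init__()
            self.hidden_size = hidden_size
            self.kernel_size = kernel_size
            # One convolution produces the candidate z and the gates f, o for all time steps at once.
            self.conv = nn.Conv1d(input_size, 3 * hidden_size, kernel_size)

        def forward(self, x):
            # x: (batch, time, input_size), e.g. a sequence of noisy speech feature frames
            x = x.transpose(1, 2)                      # (batch, input_size, time)
            x = F.pad(x, (self.kernel_size - 1, 0))    # left-pad so the convolution is causal
            z, f, o = self.conv(x).chunk(3, dim=1)     # each: (batch, hidden_size, time)
            z, f, o = torch.tanh(z), torch.sigmoid(f), torch.sigmoid(o)
            # fo-pooling: the only sequential step, a cheap element-wise recurrence
            c = torch.zeros(x.size(0), self.hidden_size, device=x.device)
            outputs = []
            for t in range(z.size(2)):
                c = f[:, :, t] * c + (1 - f[:, :, t]) * z[:, :, t]
                outputs.append(o[:, :, t] * c)
            return torch.stack(outputs, dim=1)         # (batch, time, hidden_size)

In a speech enhancement setting, a stack of such layers followed by a fully connected output layer could map noisy feature frames (for example, log-power spectra) to enhanced targets; the exact network configuration used in the paper is not reproduced here.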

