[1] |
HINTON G,DENG Li,YU Dong,et al.Deep neural networks for acoustic modeling in speech recognition:the shared views of four research groups[J].IEEE Signal Processing Magazine,2012,29(6):82-97.
|
[2] |
LI Weilin,WEN Jian,MA Wenkai.Research on speech recognition system based on deep neural networks[J].Computer Science,2016,43(11A):45-49.(in Chinese)
|
[3] |
GRAVES A,JAITLY N.Towards end-to-end speech recognition with recurrent neural networks[EB/OL].[2018-03-09].http://proceedings.mlr.press/v32/graves14.pdf.
|
[4] |
HOCHREITER S,SCHMIDHUBER J.Long short-term memory[J].Neural Computation,1997,9(8):1735-1780.
|
[5] |
GRAVES A,FERNÁNDEZ S,GOMEZ F,et al.Connectionist temporal classification:labelling unsegmented sequence data with recurrent neural networks[C]//Proceedings of the 23rd International Conference on Machine Learning.New York,USA:ACM Press,2006:369-376.
|
[6] |
LI Jie,ZHANG Heng,CAI Xinyuan,et al.Towards end-to-end speech recognition for Chinese Mandarin using long short-term memory recurrent neural networks[C]//Proceedings of the 16th Annual Conference of International Speech Communication Association.Baixas,France:International Speech Communication Association,2015:1-5.
|
[7] |
SAK H,SENIOR A,BEAUFAYS F.Long short-term memory based recurrent neural network architectures for large vocabulary speech recognition[EB/OL].[2018-03-09].https://arxiv.org/pdf/1402.1128v1.pdf.
|
[8] |
YU Dong,LI Jinyu.Recent progresses in deep learning based acoustic models[J].IEEE/CAA Journal of Automatica Sinica,2017,4(3):396-409.
|
[9] |
MIAO Yajie,GOWAYYED M,METZE F.EESEN:end-to-end speech recognition using deep RNN models and WFST-based decoding[C]//Proceedings of 2015 IEEE Workshop on Automatic Speech Recognition and Understanding.Washington D.C.,USA:IEEE Press,2016:167-174.
|
[10] |
LI Changjiang,HU Yan.Research on phoneme recognition based on recurrent neural networks[J].Microelectronics & Computer,2017,34(8):47-51.(in Chinese)
|
[11] |
ZENKEL T,SANABRIA R,METZE F,et al.Comparison of decoding strategies for CTC acoustic models[EB/OL].[2018-03-09].https://arxiv.org/pdf/1708.04469.pdf.
|
[12] |
WANG Dong,ZHANG Xuewei.THCHS-30:a free Chinese speech corpus[EB/OL].[2018-03-09].https://arxiv.org/pdf/1512.01882.pdf.
|
[13] |
PUNDAK G,SAINATH T N.Lower frame rate neural network acoustic models[EB/OL].[2018-03-09].https://storage.googleapis.com/pub-tools-public-publication-data/pdf/45555.pdf.
|
[14] |
MIAO Yajie,GOWAYYED M,NA Xingyu,et al.An empirical exploration of CTC acoustic models[C]//Proceedings of International Conference on Acoustics,Speech and Signal Processing.Washington D.C.,USA:IEEE Press,2016:2623-2627.
|
[15] |
GHAHREMANI P,BABAALI B,POVEY D,et al.A pitch extraction algorithm tuned for automatic speech recognition[C]//Proceedings of International Conference on Acoustics,Speech and Signal Processing.Washington D.C.,USA:IEEE Press,2014:2494-2498.
|
[16] |
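QIN Chuxiong,ZHANG Lianhai.Convolutional neural network acoustic modeling method fusing multi-stream features for low-resource speech recognition[J].Journal of Computer Applications,2016,36(9):2609-2615.(in Chinese)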
|
[17] |
BILLA J.Improving LSTM-CTC based ASR performance in domains with limited training data[EB/OL].[2018-03-09].https://arxiv.org/pdf/1707.00722.pdf.
|
[18] |
KINGSBURY B.Lattice-based optimization of sequence classification criteria for neural-network acoustic modeling[C]//Proceedings of International Conference on Acoustics,Speech and Signal Processing.Washington D.C.,USA:IEEE Press,2009:3761-3764.
|