[1] SCHULLER B,STEIDL S,BATLINER A.The INTERSPEECH 2009 emotion challenge[C]//Proceedings of the 10th Annual Conference of the International Speech Communication Association.Brighton,UK:[s.n.],2009:312-315.
[2] SCHULLER B,STEIDL S,BATLINER A,et al.The INTERSPEECH 2013 computational paralinguistics challenge:social signals,conflict,emotion,autism[C]//Proceedings of the 14th Annual Conference of the International Speech Communication Association.Lyon,France:[s.n.],2013:148-152.
[3] RINGEVAL F,SCHULLER B,VALSTAR M,et al.AV+EC 2015:the first affect recognition challenge bridging across audio,video,and physiological data[C]//Proceedings of the 5th International Workshop on Audio/Visual Emotion Challenge.New York,USA:ACM Press,2015:3-8.
[4] CHENCHAH F,LACHIRI Z.Acoustic emotion recognition using linear and nonlinear cepstral coefficients[J].International Journal of Advanced Computer Science and Applications,2015,6(11):1-10.
[5] SATT A,ROZENBERG S,HOORY R.Efficient emotion recognition from speech using deep learning on spectrograms[C]//Proceedings of the 18th Annual Conference of the International Speech Communication Association.Stockholm,Sweden:[s.n.],2017:1089-1093.
[6] ZHANG Y Y,DU J,WANG Z R,et al.Attention based fully convolutional network for speech emotion recognition[C]//Proceedings of Asia-Pacific Signal and Information Processing Association Annual Summit and Conference.Washington D.C.,USA:IEEE Press,2019:1771-1775.
[7] HUANG Z W,DONG M,MAO Q R,et al.Speech emotion recognition using CNN[C]//Proceedings of the 22nd ACM International Conference on Multimedia.New York,USA:ACM Press,2014:801-804.
[8] XU Y F,XU H,ZOU J Y.HGFM:a hierarchical grained and feature model for acoustic emotion recognition[C]//Proceedings of 2020 IEEE International Conference on Acoustics,Speech and Signal Processing.Washington D.C.,USA:IEEE Press,2020:6499-6503.
[9] DAI D Y,WU Z Y,LI R N,et al.Learning discriminative features from spectrograms using center loss for speech emotion recognition[C]//Proceedings of 2019 IEEE International Conference on Acoustics,Speech and Signal Processing.Washington D.C.,USA:IEEE Press,2019:7405-7409.
[10] SCHMIDT E M,KIM Y E.Learning emotion-based acoustic features with deep belief networks[C]//Proceedings of IEEE Workshop on Applications of Signal Processing to Audio and Acoustics.Washington D.C.,USA:IEEE Press,2011:65-68.
[11] HAN K,YU D,TASHEV I.Speech emotion recognition using deep neural network and extreme learning machine[C]//Proceedings of the 15th Annual Conference of the International Speech Communication Association.Singapore:[s.n.],2014:223-227.
[12] MIRSAMADI S,BARSOUM E,ZHANG C.Automatic speech emotion recognition using recurrent neural networks with local attention[C]//Proceedings of IEEE International Conference on Acoustics,Speech and Signal Processing.Washington D.C.,USA:IEEE Press,2017:2227-2231.
[13] CHEN M Y,HE X J,YANG J,et al.3-D convolutional recurrent neural networks with attention model for speech emotion recognition[J].IEEE Signal Processing Letters,2018,25(10):1440-1444.
[14] BUSSO C,BULUT M,LEE C C,et al.IEMOCAP:interactive emotional dyadic motion capture database[J].Language Resources and Evaluation,2008,42(4):335-359.
[15] BURKHARDT F,PAESCHKE A,ROLFES M,et al.A database of German emotional speech[C]//Proceedings of the 6th Annual Conference of the International Speech Communication Association.Lisbon,Portugal:[s.n.],2005:1517-1520.
[16] LI P,SONG Y,LAN M,et al.An attention pooling based representation learning method for speech emotion recognition[C]//Proceedings of the 19th Annual Conference of the International Speech Communication Association.Hyderabad,India:[s.n.],2018:3087-3091.
[17] YENIGALLA P,KUMAR A,TRIPATHI S.Speech emotion recognition using spectrogram & phoneme embedding[C]//Proceedings of the 19th Annual Conference of the International Speech Communication Association.Hyderabad,India:[s.n.],2018:3688-3692.
[18] ZHAO J,MAO X,CHEN L.Speech emotion recognition using deep 1D & 2D CNN LSTM networks[J].Biomedical Signal Processing and Control,2019,47:312-323.
[19] MCFEE B,RAFFEL C,LIANG D,et al.Librosa:audio and music signal analysis in Python[C]//Proceedings of the 14th Python in Science Conference.Austin,USA:SciPy,2015:18-25.
[20] LI R N,WU Z Y,JIA J,et al.Dilated residual network with multi-head self-attention for speech emotion recognition[C]//Proceedings of 2019 IEEE International Conference on Acoustics,Speech and Signal Processing.Washington D.C.,USA:IEEE Press,2019:6675-6679.
[21] ZHAO H,XIAO Y,HAN J,et al.Compact convolutional recurrent neural networks via binarization for speech emotion recognition[C]//Proceedings of 2019 IEEE International Conference on Acoustics,Speech and Signal Processing.Washington D.C.,USA:IEEE Press,2019:6690-6694.
[22] ZHONG Y,HU Y,HUANG H,et al.A lightweight model based on separable convolution for speech emotion recognition[C]//Proceedings of the 21st Annual Conference of the International Speech Communication Association.Shanghai,China:[s.n.],2020:3331-3335.
[23] LATINUS M,TAYLOR M J.Discriminating male and female voices:differentiating pitch and gender[J].Brain Topography,2012,25(2):194-204.
[24] XU Z,MEYER P,FINGSCHEIDT T.On the effects of speaker gender in emotion recognition training data[C]//Proceedings of ITG Conference 2018.Dresden,Germany:ITG-Fachbericht,2018:61-64.
[25] VOGT T,ANDRE E.Improving automatic emotion recognition from speech via gender differentiation[C]//Proceedings of Language Resources and Evaluation Conference.Genoa,Italy:LREC,2006:1123-1126.
[26] NEUMANN M,VU N T.Improving speech emotion recognition with unsupervised representation learning on unlabeled speech[C]//Proceedings of 2019 IEEE International Conference on Acoustics,Speech and Signal Processing.Washington D.C.,USA:IEEE Press,2019:7390-7394.