[1] 陈龙杰, 张钰, 张玉梅, 等.基于多注意力多尺度特征融合的图像描述生成算法[J].计算机应用, 2019, 39(2):354-359. CHEN L J, ZHANG Y, ZHANG Y M, et al.Image description generation algorithm based on multi-attention and multi-scale feature fusion[J].Journal of Computer Applications, 2019, 39(2):354-359.(in Chinese) [2] 汤鹏杰, 王瀚漓, 许恺晟.LSTM逐层多目标优化及多层概率融合的图像描述[J].自动化学报, 2018, 44(7):1237-1249. TANG P J, WANG H L, XU K S.LSTM layer by layer multi-object optimization and multi-layer probability fusion image description[J].Acta Automatica Sinica, 2018, 44(7):1237-1249.(in Chinese) [3] LECUN Y, BENGIO Y, HINTON G.Deep learning[J].Nature, 2015, 521(7553):436-444. [4] HOPFIELD J J.Neural networks and physical systems with emergent collective computational abilities[J].Proceedings of the National Academy of Sciences of the United States of America, 1982, 79(8):2554-2558. [5] MCNEELY-WHITE D, BEVERIDGE J R, DRAPER B A.Inception and ResNet features are(almost) equivalent[J].Cognitive Systems Research, 2020, 59:312-318. [6] LI S, LI W, COOK C, et al.Independently Recurrent Neural Network(IndRNN):building a longer and deeper RNN[EB/OL].[2020-05-05].https://arxiv.org/pdf/1803.04831v3.pdf. [7] WANG J, LI S, AN Z, et al.Batch-normalized deep neural networks for achieving fast intelligent fault diagnosis of machines[J].Neurocomputing, 2019, 329(15):53-65. [8] KULKARNI G, PREMRAJ V, ORDONEZ V, et al.Baby talk:understanding and generating simple image descriptions[J].IEEE Transactions on Pattern Analysis & Machine Intelligence, 2013, 35(12):2891-2903. [9] FARHADI A, HEJRATI S M M, SADEGHI M A, et al.Every picture tells a story:generating sentences from images[EB/OL].[2020-05-05].http://www.cs.cmu.edu/afs/.cs.cmu.edu/Web/People/afarhadi/papers/sentence.pdf. [10] VINYALS O, TOSHEV A, BENGIO S, et al.Show and tell:a neural image caption generator[C]//Proceedings of 2015 IEEE Conference on Computer Vision and Pattern Recognition.Washington D.C., USA:IEEE Press, 2015:12-15. [11] XU K, BA J, KIROS R, et al.Show, attend and tell:neural image caption generation with visual attention[J].Computer Ence, 2015, 5:2048-2057. [12] LI J, MEI X, PROKHOROV D, et al.Deep neural network for structural prediction and lane detection in traffic scene[J].IEEE Transactions on Neural Networks & Learning Systems, 2017, 28(3):690-703. [13] QU Y, LIN L, SHEN F, et al.Joint hierarchical category structure learning and large-scale image classification[J].IEEE Transactions on Image Processing, 2017, 26(9):4331-4346. [14] SHELHAMER E, LONG J, DARRELL T.Fully convolutional networks for semantic segmentation[J].IEEE Transactions on Pattern Analysis and Machine Intelligence, 2017, 39(4):640-651. [15] GONG C, TAO D, LIU W, et al.Label propagation via teaching-to-learn and learning-to-teach[J].IEEE Transactions on Neural Networks and Learning Systems, 2017, 28(6):1452-1465. [16] HE K, ZHANG X, REN S, et al.Deep residual learning for image recognition[C]//Proceedings of 2016 International Conference on Computer Vision and Pattern Recognition.Washington D.C., USA:IEEE Press, 2016:770-778. [17] DOGNIN P L, MELNYK I, MROUEH Y, et al.Adversarial semantic alignment for improved image captions[EB/OL].[2020-05-05].https://arxiv.org/pdf/1805.00063v3.pdf. [18] SANGER T D.Optimal unsupervised learning in a single-layer linear feedforward neural network[J].Neural Networks, 1989, 2(6):459-473. [19] YAO T, PAN Y, LI Y, et al.Incorporating copying mechanism in image captioning for learning novel objects[C]//Proceedings of 2017 IEEE Conference on Computer Vision and Pattern Recognition.Washington D.C., USA:IEEE Press, 2017:10-20. [20] XU Y, WU B, SHEN F, et al.Exact adversarial attack to image captioning via structured output learning with latent variables[C]//Proceedings of 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition.Washington D.C., USA:IEEE Press, 2020:142-156. [21] HOCHREITER S, SCHMIDHUBER J.Long short-term memory[J].Neural Computation, 1997, 9(8):1735-1780. |