[1] MCCULLOCH W S, PITTS W.A logical calculus of the ideas immanent in nervous activity[J].Bulletin of Mathematical Biology, 1990, 52(1/2):99-115.
[2] HINTON G E, SALAKHUTDINOV R R.Reducing the dimensionality of data with neural networks[J].Science, 2006, 313(5786):504-507.
[3] ROSENBLATT F.The perceptron:a probabilistic model for information storage and organization in the brain[J].Psychological Review, 1958, 65(6):386-408.
[4] LECUN Y, BOTTOU L, BENGIO Y, et al.Gradient-based learning applied to document recognition[J].Proceedings of the IEEE, 1998, 86(11):2278-2324.
[5] SI N W, ZHANG W L, QU D, et al.A review on representation visualization of convolutional neural networks[J/OL].Acta Automatica Sinica:1-31[2021-02-12].https://doi.org/10.16383/j.aas.c200554.(in Chinese)
[6] RUMELHART D E, HINTON G E, WILLIAMS R J.Learning representations by back-propagating errors[J].Nature, 1986, 323(6088):533-536.
[7] LIU Q.An improved deep convolutional neural network and its weight initialization[D].Baoding:Hebei University, 2018.(in Chinese)
[8] SHEN C K.Research on initialization method of convolutional neural networks[D].Beijing:Beijing University of Technology, 2017.(in Chinese)
[9] BURKARDT J.The truncated normal distribution[EB/OL].[2021-06-01].https://www.doc88.com/p-1176985733398.html.
[10] LI Y J, SHEN C K, YANG H L, et al.PCA shuffling initialization of convolutional neural networks[J].Journal of Beijing University of Technology, 2017, 43(1):22-27.(in Chinese)
[11] SHEN H.Towards a mathematical understanding of the difficulty in learning with feedforward neural networks[EB/OL].[2021-06-01].https://arxiv.org/abs/1611.05827.
[12] HE K M, ZHANG X Y, REN S Q, et al.Delving deep into rectifiers:surpassing human-level performance on ImageNet classification[C]//Proceedings of 2015 IEEE International Conference on Computer Vision.Washington D.C., USA:IEEE Press, 2015:1026-1034.
[13] ZHANG H, ZHANG Q, YU J Y.A review of the development and property analysis of activation function[J].Journal of Xihua University(Natural Science Edition), 2021, 40(4):1-10.(in Chinese)
[14] LI J.Research and application of weight initialization of convolutional neural networks[D].Qingdao:Qingdao University, 2020.(in Chinese)
[15] HAN X, ZHANG Z Y, DING N, et al.Pre-trained models:past, present and future[J].AI Open, 2021, 2:225-250.
[16] KETKAR N S.Introduction to PyTorch[M].Berlin, Germany:Springer, 2017.
[17] HAN J, MORAGA C.The influence of the sigmoid function parameters on the speed of backpropagation learning[C]//Proceedings of IEEE International Workshop on Artificial Neural Networks.Washington D.C., USA:IEEE Press, 1995:195-201.
[18] DAHL G E, SAINATH T N, HINTON G E.Improving deep neural networks for LVCSR using rectified linear units and dropout[C]//Proceedings of IEEE International Conference on Acoustics, Speech and Signal Processing.Washington D.C., USA:IEEE Press, 2013:8609-8613.
[19] KRIZHEVSKY A, SUTSKEVER I, HINTON G E.ImageNet classification with deep convolutional neural networks[J].Communications of the ACM, 2017, 60(6):84-90.
[20] HE K M, ZHANG X Y, REN S Q, et al.Deep residual learning for image recognition[C]//Proceedings of 2016 IEEE Conference on Computer Vision and Pattern Recognition.Washington D.C., USA:IEEE Press, 2016:770-778.
[21] BARABASI A L, ALBERT R.Emergence of scaling in random networks[J].Science, 1999, 286(5439):509-512.
[22] KANG G L, DONG X Y, ZHENG L, et al.PatchShuffle regularization[EB/OL].[2021-06-01].https://arxiv.org/abs/1707.07103.
[23] MCMAHAN H B, HOLT G, SCULLEY D, et al.Ad click prediction:a view from the trenches[C]//Proceedings of the 19th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining.New York, USA:ACM Press, 2013:1222-1230.