[1] KANG L, YE P, LI Y, et al.Convolutional neural networks for no-reference image quality assessment[C]//Proceedings of 2014 IEEE Conference on Computer Vision and Pattern Recognition.Washington D.C., USA:IEEE Press, 2014:1733-1740. [2] JIAO L C, ZHAO J, YANG S Y, et al.Deep learning, optimization and recognition[M].Beijing:Tsinghua University Press, 2017. [3] 刘凯, 林基明, 郑霖, 等.基于深度自编码网络的慢速移动目标检测[J].计算机工程, 2018, 44(2):129-134. LIU K, LIN J M, ZHENG L, et al.Slow moving target detection based on deep self-coding network[J].Computer Engineering, 2018, 44(2):129-134.(in Chinese) [4] CHEN L C, PAPANDREOU G, KOKKINOS I, et al.Semantic image segmentation with deep convolutional nets and fully connected CRFs[EB/OL].[2021-08-01].https://arxiv.org/abs/1606.00915v1. [5] SCHROFF F, KALENICHENKO D, PHILBIN J.FaceNet:a unified embedding for face recognition and clustering[C]//Proceedings of 2015 IEEE Conference on Computer Vision and Pattern Recognition.Washington D.C., USA:IEEE Press, 2015:815-823. [6] CHEN X Z, KUNDU K, ZHANG Z Y, et al.Monocular 3D object detection for autonomous driving[C]//Proceedings of 2016 IEEE Conference on Computer Vision and Pattern Recognition.Washington D.C., USA:IEEE Press, 2016:2147-2156. [7] DEAN J, CORRADO G S, MONGA R, et al.Large scale distributed deep networks[C]//Proceedings of Advances in Neural Information Processing Systems.Cambridge, USA:MIT Press, 2012:1223-1231. [8] KRIZHEVSKY A, SUTSKEVER I, HINTON G E.ImageNet classification with deep convolutional neural networks[J].Communications of the ACM, 2017, 60(6):84-90. [9] SIMONYAN K, ZISSERMAN A.Very deep convolutional networks for large-scale image recognition[EB/OL].[2021-08-01].https://arxiv.org/abs/1409.1556. [10] ABADI M, AGARWAL A, BARHAM P, et al.TensorFlow:large-scale machine learning on heterogeneous distributed systems[EB/OL].[2021-08-01].https://arxiv.org/abs/1603.04467v2. [11] SERGEEV A, DEL BALSO M.Horovod:fast and easy distributed deep learning in TensorFlow[EB/OL].[2021-08-01].https://arxiv.org/abs/1802.05799. [12] HOFFER E, HUBARA I, SOUDRY D.Train longer, generalize better:closing the generalization gap in large batch training of neural networks[EB/OL].[2021-08-01].https://arxiv.org/abs/1705.08741. [13] SEIDE F, FU H, DROPPO J, et al.On parallelizability of stochastic gradient descent for speech DNNS[C]//Proceedings of 2014 IEEE International Conference on Acoustics, Speech and Signal Processing.Washington D.C., USA:IEEE Press, 2014:235-239. [14] WU Y H, SCHUSTER M, CHEN Z F, et al.Google's neural machine translation system:bridging the gap between human and machine translation[EB/OL].[2021-08-01].https://arxiv.org/abs/1609.08144. [15] SZEGEDY C, VANHOUCKE V, IOFFE S, et al.Rethinking the inception architecture for computer vision[C]//Proceedings of 2016 IEEE Conference on Computer Vision and Pattern Recognition.Washington D.C., USA:IEEE Press, 2016:2818-2826. [16] BENGIO S, NOROUZI M, STEINER B, et al.Device placement optimization with reinforcement learning:USA, 2018175972A1[P].2018-03-23. [17] MIRHOSEINI A, GOLDIE A, PHAM H, et al.A hierarchical model for device placement[C]//Proceedings of International Conference on Learning Representations.Washington D.C., USA:IEEE Press, 2018:246-258. [18] COATES A, HUVAL B, WANG T, et al.Deep learning with COTS HPC systems[C]//Proceedings of the 30th International Conference on International Conference on Machine Learning.Washington D.C., USA:IEEE Press, 2013:568-577. [19] LI M.Scaling distributed machine learning with the parameter server[C]//Proceedings of 2014 International Conference on Big Data Science and Computing.New York, USA:ACM Press, 2014:264-275. [20] GIBIANSKY A.Bringing HPC techniques to deep learning[EB/OL].[2021-08-01].http://research.baidu.com/bringing-hpc-techniques-deep-learning/. [21] DENG J, DONG W, SOCHER R, et al.ImageNet:a large-scale hierarchical image database[C]//Proceedings of 2009 IEEE Conference on Computer Vision and Pattern Recognition.Washington D.C., USA:IEEE Press, 2009:248-255. [22] LIAN X R, ZHANG C, ZHANG H, et al.Can decentralized algorithms outperform centralized algorithms?A case study for decentralized parallel stochastic gradient descent[EB/OL].[2021-08-01].https://arxiv.org/abs/1705.09056. [23] SHI S H, WANG Q, CHU X W.Performance modeling and evaluation of distributed deep learning frameworks on GPUs[EB/OL].[2021-08-01].https://arxiv.org/abs/1711.05979. [24] NCCL.NVIDIA Collective Communications Library[EB/OL].[2021-08-01].https://developer.nvidia.com/nccl. [25] NVIDIA.Nvidia Cuda C Programming Guide[EB/OL].[2021-08-01].https://zhuanlan.zhihu.com/p/53773183. |