[1] 余凯, 贾磊, 陈雨强, 等.深度学习的昨天、今天和明天[J].计算机研究与发展, 2013, 50(9):1799-1804. YU K, JIA L, CHEN Y Q, et al.Deep learning:yesterday, today, and tomorrow[J].Journal of Computer Research and Development, 2013, 50(9):1799-1804.(in Chinese) [2] LECUN Y, BENGIO Y, HINTON G.Deep learning[J].Nature, 2015, 521(7553):436-444. [3] HASSABALLAH M, AWAD A I.Deep learning in computer vision:principles and applications[M].[S.l.]:CRC Press, 2020. [4] OTTER D W, MEDINA J R, KALITA J K.A survey of the usages of deep learning for natural language processing[J].IEEE Transactions on Neural Networks and Learning Systems, 2021, 32(2):604-624. [5] GOODFELLOW I, BENGIO Y, COURVILLE A.Deep learning[M].Cambridge, USA:MIT Press, 2016. [6] ABADI M, BARHAM P, CHEN J M, et al.TensorFlow:a system for large-scale machine learning[C]//Proceedings of the 12th USENIX Conference on Operating Systems Design and Implementation.San Diego, USA:USENIX Association, 2016:265-283. [7] PASZKE A, GROSS S, MASSA F, et al.PyTorch:an imperative style, high-performance deep learning library[EB/OL].[2022-02-05].https://arxiv.org/abs/1912.01703. [8] Intel.oneDNN documentation[EB/OL].[2022-02-05].https://oneapi-src.github.io/oneDNN/. [9] CHETLUR S, WOOLLEY C, VANDERMERSCH P, et al.cuDNN:efficient primitives for deep learning[EB/OL].[2022-02-05].https://arxiv.org/abs/1410.0759. [10] KHAN J, FULTZ P, TAMAZOV A, et al.MIOpen:an open source library for deep learning primitives[EB/OL].[2022-02-05].https://arxiv.org/abs/1910.00078. [11] Intel.clDNN documentation[EB/OL].[2022-02-05].https://intel.github.io/clDNN/index.html. [12] MUNSHI A.The OpenCL specification[C]//Proceedings of IEEE Hot Chips 21 Symposium.Washington D.C., USA:IEEE Press, 2016:1-314. [13] DAEYEON K.OpenDNN:an open-source, cuDNN-like deep learning primitive library[EB/OL].[2022-02-05].https://s-space.snu.ac.kr/bitstream/10371/150799/1/000000154337.pdf. [14] STONE J E, GOHARA D, SHI G C.OpenCL:a parallel programming standard for heterogeneous computing systems[J].Computing in Science & Engineering, 2010, 12(3):66-72. [15] SZEGEDY C, TOSHEV A, ERHAN D.Deep neural networks for object detection[EB/OL].[2022-02-05].https://www.semanticscholar.org/paper/Deep-Neural-Net works-for-Object-Detection-Szegedy-Toshev/713f73ce5c3013d9fb796c21b981dc6629af0bd5. [16] LARABEL M.Google looks to open up StreamExecutor to make GPGPU programming easier[EB/OL].[2022-02-25].https://www.phoronix.com/scan.php?page=news_item&px=Google-StreamExec-Parallel. [17] HABIBZADEH M, SHISHVAN O R, SOYATA T.CUDA libraries[M]//SOYATA T.GPU parallel program development using CUDA.Berlin, Germany:Springer, 2018:383-395. [18] SZEGEDY C, IOFFE S, VANHOUCKE V, et al.Inception-v4, inception-ResNet and the impact of residual connections on learning[C]//Proceedings of the 31st AAAI Conference on Artificial Intelligence.New York, USA:ACM Press, 2017:4278-4284. [19] RONNEBERGER O, FISCHER P, BROX T.U-net:convolutional networks for biomedical image segmentation[C]//Proceedings of International Conference on Medical Image Computing and Computer-Assisted Intervention.Berlin, Germany:Springer, 2015:234-241. [20] DEVLIN J, CHANG M W, LEE K, et al.BERT:pre-training of deep bidirectional transformers for language understanding[EB/OL].[2022-02-05].https://arxiv.org/abs/1810.04805. [21] MARTIN C.Hacking MIOpen to run anywhere and libre ML-lightingtalk@COSCUP 2017 postscript[EB/OL].[2022-02-05].https://mightynotes.wordpress.com/2017/08/13/hacking-miopen-to-run-anywhere-and-libre-ml-light ingtalkcoscup-2017-postscript/. [22] Advanced Micro Devices, Inc.Welcome to AMD ROCm™ platform[EB/OL].[2022-02-05].https://rocmdocs.amd.com/en/latest/index.html. [23] CHEN T Q, LI M, LI Y T, et al.MXNet:a flexible and efficient machine learning library for heterogeneous distributed systems[EB/OL].[2022-02-05].https://arxiv.org/abs/1512.01274. [24] MAUDOUX G, MENS K.Correct, efficient, and tailored:the future of build systems[J].IEEE Software, 2018, 35(2):32-37. [25] NUGTEREN C.CLBlast:a tuned OpenCL BLAS library[C]//Proceedings of International Workshop on OpenCL.New York, USA:ACM Press, 2018:1-10. [26] Advanced Micro Devices, Inc.clBLAS library user documenta-tion[EB/OL].[2022-02-05].https://github.com/clMath Libraries/clBLAS. [27] Python Software Foundation.unittest introduction[EB/OL].[2022-02-05].https://docs.python.org/zh-cn/3.7/library/unittest.html. [28] SENGUPTA A, YE Y T, WANG R, et al.Going deeper in spiking neural networks:VGG and residual architectures[J].Frontiers in Neuroscience, 2019, 13:95. [29] KRIZHEVSKY A.Learning multiple layers of features from tiny images[EB/OL].[2022-02-05].http://www.cs.toronto.edu/~kriz/learning-features-2009-TR.pdf. [30] SANDERS J, KANDROT E.An introduction to general-purpose GPU programming[M].[S.l.]:Addison-Wesley, 2010. [31] JÄÄSKELÄINEN P, DE LA LAMA C S, SCHNETTER E, et al.pocl:a performance-portable OpenCL implementation[J].International Journal of Parallel Programming, 2015, 43(5):752-785. |