[1] LI M,ANDERSEN D G,PARK J W,et al.Scaling distributed machine learning with the parameter server[C]//Proceedings of the 11th USENIX Symposium on Operating Systems Design and Implementation.[S.l.]:USENIX,2014:583-598.
[2] Baidu.Baidu-allreduce[EB/OL].[2019-02-10].https://github.com/baidu-research/baidu-allreduce.
[3] RECHT B,RE C,WRIGHT S,et al.Hogwild!:a lock-free approach to parallelizing stochastic gradient descent[C]//Proceedings of NIPS'11.Berlin,Germany:Springer,2011:693-701.
[4] HO Q,CIPAR J,CUI H,et al.More effective distributed ML via a stale synchronous parallel parameter server[C]//Proceedings of NIPS'13.Berlin,Germany:[s.n.],2013:1223-1231.
[5] HPC milestone:the IBM POWER8 server is connected to the Tesla P100 via NVLINK[J].Intelligent Manufacturing,2016(9):49.(in Chinese)HPC里程碑:IBM POWER8服务器通过NVLINK与Tesla P100互联[J].智能制造,2016(9):49.
[6] WEI Xingda,CHEN Rong,CHEN Haibo.Optimizing distributed systems with remote direct memory access[J].Big Data Research,2018,4(4):3-14.(in Chinese)魏星达,陈榕,陈海波.基于RDMA高速网络的高性能分布式系统[J].大数据,2018,4(4):3-14.
[7] CHAHAL K,GROVER M S,DEY K.A hitchhiker's guide on distributed training of deep neural networks[EB/OL].[2019-02-10].https://arxiv.org/pdf/1810.11787.pdf.
[8] ABADI M,BARHAM P,CHEN J,et al.TensorFlow:a system for large-scale machine learning[C]//Proceedings of the 12th USENIX Symposium on Operating Systems Design and Implementation.[S.l.]:USENIX,2016:265-283.
[9] PASZKE A,GROSS S,CHINTALA S,et al.PyTorch:tensors and dynamic neural networks in Python with strong GPU acceleration[EB/OL].[2019-02-10].https://github.com/t-vi/pytorch.
[10] CHEN Tianqi,LI Mu,LI Yutian,et al.MXNet:a flexible and efficient machine learning library for heterogeneous distributed systems[EB/OL].[2019-02-10].https://arxiv.org/pdf/1512.01274.pdf.
[11] ZHANG Haoshenglun,LI Chong,KE Yong,et al.A distributed user browse click model algorithm[J].Computer Engineering,2019,45(3):1-6.(in Chinese)张浩盛伦,李翀,柯勇,等.一种分布式用户浏览点击模型算法[J].计算机工程,2019,45(3):1-6.
[12] HE X N,CHUA T S.Neural factorization machines for sparse predictive analytics[C]//Proceedings of the 40th International ACM SIGIR Conference on Research and Development in Information Retrieval.New York,USA:ACM Press,2017:355-364.
[13] CHENG H T,KOC L,HARMSEN J,et al.Wide and deep learning for recommender systems[C]//Proceedings of the 1st Workshop on Deep Learning for Recommender Systems.New York,USA:ACM Press,2016:7-10.
[14] KINGMA D P,WELLING M.Auto-encoding variational Bayes[EB/OL].[2019-02-10].https://arxiv.org/pdf/1312.6114.pdf.
[15] DEAN J,GHEMAWAT S.MapReduce:simplified data processing on large clusters[J].Communications of the ACM,2008,51(1):107-113.
[16] KINGMA D P,BA J.Adam:a method for stochastic optimization[EB/OL].[2019-02-10].https://arxiv.org/pdf/1412.6980.pdf.
[17] McMAHAN H B,HOLT G,SCULLEY D,et al.Ad click prediction:a view from the trenches[C]//Proceedings of the 19th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining.New York,USA:ACM Press,2013:1222-1230.
[18] YOU Y,ZHANG Z,HSIEH C J,et al.ImageNet training in minutes[C]//Proceedings of the 47th International Conference on Parallel Processing.New York,USA:ACM Press,2018:1.
[19] ZHENG Shuxin,MENG Qi,WANG Taifeng,et al.Asynchronous stochastic gradient descent with delay compensation[C]//Proceedings of the 34th International Conference on Machine Learning.Sydney,Australia:[s.n.],2017:4120-4129.
[20] SERGEEV A,DELBALSO M.Horovod:fast and easy distributed deep learning in TensorFlow[EB/OL].[2019-02-10].https://arxiv.org/pdf/1802.05799.pdf.
[21] SONG Kuangshi.LightCTR[EB/OL].[2019-02-10].https://github.com/cnkuangshi/LightCTR.