| 1 |
|
| 2 |
KE L, ZHANG X, LEE B, et al. DisaggRec: architecting disaggregated systems for large-scale personalized recommendation[EB/OL]. [2023-12-31]. https://arxiv.org/abs/2212.00939.
|
| 3 |
JIANG W Q, HE Z H, ZHANG S, et al. MicroRec: efficient recommendation inference by hardware and data structure solutions[C]//Proceedings of Conference on Machine Learning and Systems. [S. l.]: MLSys Committee, 2021: 845-859.
|
| 4 |
JIANG W Q, HE Z H, ZHANG S, et al. FleetRec: large-scale recommendation inference on hybrid GPU-FPGA clusters[C]//Proceedings of the 27th ACM SIGKDD Conference on Knowledge Discovery & Data Mining. New York, USA: ACM Press, 2021: 3097-3105.
|
| 5 |
GUPTA U, WU C J, WANG X D, et al. The architectural implications of Facebook's DNN-based personalized recommendation[C]//Proceedings of the IEEE International Symposium on High Performance Computer Architecture (HPCA). Washington D.C., USA: IEEE Press, 2020: 488-501.
|
| 6 |
HAZELWOOD K, BIRD S, BROOKS D, et al. Applied machine learning at Facebook: a datacenter infrastructure perspective[C]//Proceedings of the IEEE International Symposium on High Performance Computer Architecture (HPCA). Washington D.C., USA: IEEE Press, 2018: 620-629.
|
| 7 |
LIU Z R, SONG Q Q, LI L, et al. PME: pruning-based multi-size embedding for recommender systems[J]. Frontiers in Big Data, 2023, 6: 1195742.
doi: 10.3389/fdata.2023.1195742
|
| 8 |
LAI F, ZHANG W, LIU R, et al. AdaEmbed: adaptive embedding for large-scale recommendation models[C]//Proceedings of the USENIX Symposium on Operating Systems Design and Implementation (OSDI'23). Boston, USA: USENIX Association, 2023: 817-831.
|
| 9 |
SHI H M, MUDIGERE D, NAUMOV M, et al. Compositional embeddings using complementary partitions for memory-efficient recommendation systems[C]//Proceedings of the 26th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining. New York, USA: ACM Press, 2020: 165-175.
|
| 10 |
KE L, GUPTA U, CHO B Y, et al. RecNMP: accelerating personalized recommendation with near-memory processing[C]//Proceedings of the ACM/IEEE 47th Annual International Symposium on Computer Architecture (ISCA). Washington D.C., USA: IEEE Press, 2020: 790-803.
|
| 11 |
KE L, ZHANG X, SO J, et al. Near-memory processing in action: accelerating personalized recommendation with AxDIMM[J]. IEEE Micro, 2022, 42(1): 116-127.
doi: 10.1109/MM.2021.3097700
|
| 12 |
ASGARI B, HADIDI R, CAO J S, et al. FAFNIR: accelerating sparse gathering by using efficient near-memory intelligent reduction[C]//Proceedings of the IEEE International Symposium on High-Performance Computer Architecture (HPCA). Washington D.C., USA: IEEE Press, 2021: 908-920.
|
| 13 |
ADNAN M, MABOUD Y E, MAHAJAN D, et al. Accelerating recommendation system training by leveraging popular choices[J]. Proceedings of the VLDB Endowment, 2021, 15(1): 127-140.
doi: 10.14778/3485450.3485462
|
| 14 |
XIE M H, LU Y Y, LIN J Z, et al. Fleche: an efficient GPU embedding cache for personalized recommendations[C]//Proceedings of the 17th European Conference on Computer Systems. New York, USA: ACM Press, 2022: 402-416.
|
| 15 |
SETHI G, ACUN B, AGARWAL N, et al. RecShard: statistical feature-based memory optimization for industry-scale neural recommendation[C]//Proceedings of the 27th ACM International Conference on Architectural Support for Programming Languages and Operating Systems. New York, USA: ACM Press, 2022: 344-358.
|
| 16 |
|
| 17 |
|
| 18 |
|
| 19 |
|
| 20 |
WEI Y C, LANGER M, YU F, et al. A GPU-specialized inference parameter server for large-scale deep recommendation models[C]//Proceedings of the 16th ACM Conference on Recommender Systems. New York, USA: ACM Press, 2022: 408-419.
|
| 21 |
WANG Z H, WEI Y C, LEE M, et al. Merlin HugeCTR: GPU-accelerated recommender system training and inference[C]//Proceedings of the 16th ACM Conference on Recommender Systems. New York, USA: ACM Press, 2022: 534-537.
|
| 22 |
SHAN Y Z, HUANG Y T, CHEN Y L, et al. LegoOS: a disseminated, distributed OS for hardware resource disaggregation[C]//Proceedings of the USENIX Symposium on Operating Systems Design and Implementation. Washington D.C., USA: USENIX, 2018: 1-10.
|
| 23 |
JEON M, VENKATARAMAN S, PHANISHAYEE A, et al. Analysis of large-scale multi-tenant GPU clusters for DNN training workloads[C]//Proceedings of the USENIX Annual Technical Conference (ATC'19). Washington D.C., USA: USENIX, 2019: 947-960.
|
| 24 |
|
| 25 |
GUO A Q, HAO Y C, WU C S, et al. Software-hardware co-design of heterogeneous SmartNIC system for recommendation models inference and training[C]//Proceedings of the 37th International Conference on Supercomputing. New York, USA: ACM Press, 2023: 336-347.
|
| 26 |
HILDEBRAND M. Efficient large scale DLRM implementation on heterogeneous memory systems [D]. Berkeley, USA: University of California, 2023.
|
| 27 |
ARDESTANI E K, KIM C, LEE S J, et al. Supporting massive DLRM inference through software defined memory[C]//Proceedings of the IEEE 42nd International Conference on Distributed Computing Systems (ICDCS). Washington D.C., USA: IEEE Press, 2022: 302-312.
|
| 28 |
WANG S H, MENG Z L, SUN C, et al. SmartChain: enabling high-performance service chain partition between SmartNIC and CPU[C]//Proceedings of the 2020 IEEE International Conference on Communications (ICC). Washington D.C., USA: IEEE Press, 2020: 1-7.
|
| 29 |
|
| 30 |
GROVES T, BROCK B, CHEN Y X, et al. Performance trade-offs in GPU communication: a study of host and device-initiated approaches[C]//Proceedings of the IEEE/ACM Performance Modeling, Benchmarking and Simulation of High Performance Computer Systems (PMBS). Washington D.C., USA: IEEE Press, 2020: 126-137.
|
| 31 |
|
| 32 |
ZHU Y, HE Z H, JIANG W Q, et al. Distributed recommendation inference on FPGA clusters[C]//Proceedings of the 31st International Conference on Field-Programmable Logic and Applications (FPL). Washington D.C., USA: IEEE Press, 2021: 279-285.
|
| 33 |
HU Q, ZHU D J, WU H L, et al. Survey on intelligent recommendation system[J]. Computer Systems and Applications, 2022, 31(4): 47-58.
|