[1]Guo M. Opportunities and Challenges of Network Infrastructure in the Era of Large Language Models[J]. Journal of Computer Research and Development, 2024, 61(11): 2663. DOI:10.7544/issn1000-1239.ps20241101.
[2]Li C, Deng L, Li Y, et al. Application-aware Disaggregated Storage Design for Remote Memory Graph Database[J]. Computer Science, 2025(1). DOI:10.11896/jsjkx.231200073.
[3]Hoffmann J, Borgeaud S, Mensch A, et al. Training compute-optimal large language models[J]. arXiv preprint arXiv:2203.15556, 2022.
[4]Touvron H, Martin L, Stone K, et al. Llama 2: Open foundation and fine-tuned chat models[J]. arXiv preprint arXiv:2307.09288, 2023.
[5]Zhao W, Wu J, Lu W, et al. TianMen: a DPU-based storage network offloading structure for disaggregated datacenters[C]//Proceedings of the 2024 ACM Symposium on Cloud Computing. 2024: 689-703.
[6]Liu M, Ene T D, Kirby R, et al. ChipNeMo: Domain-adapted LLMs for chip design[J]. arXiv preprint arXiv:2311.00176, 2023.
[7]Kong X, Chen J, Bai W, et al. Understanding RDMA microarchitecture resources for performance isolation[C]//20th USENIX Symposium on Networked Systems Design and Implementation (NSDI 23). 2023: 31-48.
[8]Li A, Song S L, Chen J, et al. Evaluating modern GPU interconnect: PCIe, NVLink, NV-SLI, NVSwitch and GPUDirect[J]. IEEE Transactions on Parallel and Distributed Systems, 2019, 31(1): 94-110.
[9]Weng Q, Xiao W, Yu Y, et al. MLaaS in the wild: Workload analysis and scheduling in Large-Scale heterogeneous GPU clusters[C]//19th USENIX Symposium on Networked Systems Design and Implementation (NSDI 22). 2022: 945-960.
[10]NVIDIA Technology Service (Beijing) Co., Ltd. Data Processing Unit: An Introduction to DPU Programming[M]. Beijing: China Machine Press, 2023.
[11]YUSUR Technology Co., Ltd. KaiWu Data Network Development Platform[EB/OL]. [2024-12-31]. https://www.yusur.tech/product/DNDP.
[12]Karamati S, Hughes C, Hemmert K S, et al. “Smarter” NICs for faster molecular dynamics: a case study[C]//2022 IEEE International Parallel and Distributed Processing Symposium (IPDPS). IEEE, 2022: 583-594.
[13]Barsellotti L, Alhamed F, Olmos J J V, et al. Introducing data processing units (DPU) at the Edge[C]//2022 International Conference on Computer Communications and Networks (ICCCN). IEEE, 2022: 1-6.
[14]Bayatpour M, Sarkauskas N, Subramoni H, et al. BluesMPI: Efficient MPI non-blocking alltoall offloading designs on modern BlueField smart NICs[C]//International Conference on High Performance Computing. Cham: Springer International Publishing, 2021: 18-37.
[15]Sarkauskas N, Bayatpour M, Tran T, et al. Large-message nonblocking MPI_Iallgather and MPI_Ibcast offload via BlueField-2 DPU[C]//2021 IEEE 28th International Conference on High Performance Computing, Data, and Analytics (HiPC). IEEE, 2021: 388-393.
[16]Suresh K K, Michalowicz B, Ramesh B, et al. A Novel Framework for Efficient Offloading of Communication Operations to Bluefield SmartNICs[C]//2023 IEEE International Parallel and Distributed Processing Symposium (IPDPS). IEEE, 2023: 123-133.
[17]Liu J, Lin T, Zhang Y, et al. Energy-Constrained Partial Offloading in Data Processing Unit (DPU)-Enabled Mobile Edge Computing[C]//2022 IEEE Smartworld, Ubiquitous Intelligence & Computing, Scalable Computing & Communications, Digital Twin, Privacy Computing, Metaverse, Autonomous & Trusted Vehicles (SmartWorld/UIC/ScalCom/DigitalTwin/PriComp/Meta). IEEE, 2022: 664-671.
[18]Njavro A, Tau J, Groves T, et al. A DPU Solution for Container Overlay Networks[J]. arXiv preprint arXiv:2211.10495, 2022.
[19]Gootzen P J, Pfefferle J, Stoica R, et al. DPFS: DPU-Powered File System Virtualization[C]//Proceedings of the 16th ACM International Conference on Systems and Storage. 2023: 1-7.
[20]Michalowicz B, Suresh K K, Subramoni H, et al. Battle of the BlueFields: An In-Depth Comparison of the BlueField-2 and BlueField-3 SmartNICs[C]//2023 IEEE Symposium on High-Performance Interconnects (HOTI). IEEE, 2023: 41-48.
[21]McCalpin J D. Memory bandwidth and machine balance in current high performance computers[J]. IEEE Computer Society Technical Committee on Computer Architecture (TCCA) Newsletter, 1995: 19-25.
[22]Michalowicz B, Kandadi Suresh K, Subramoni H, et al. DPU-Bench: A Micro-Benchmark Suite to Measure Offload Efficiency Of SmartNICs[M]//Practice and Experience in Advanced Research Computing. 2023: 94-101.
[23]Wang Z, Wang C, Wang L. DPUBench: An application-driven scalable benchmark suite for comprehensive DPU evaluation[J]. BenchCouncil Transactions on Benchmarks, Standards and Evaluations, 2023, 3(2): 100120.
[24]Miao R, Zhu L, Ma S, et al. From Luna to Solar: the evolutions of the compute-to-storage networks in Alibaba Cloud[C]//Proceedings of the ACM SIGCOMM 2022 Conference. 2022: 753-766.
[25]Zhang Y, Li G, Wang J, et al. DoW-KV: A DPU-offloaded and Write-optimized Key-Value Store on Disaggregated Persistent Memory[C]//2023 IEEE International Conference on Cluster Computing (CLUSTER). IEEE, 2023: 271-283.
[26]Liao Y, Wu J, Lu W, et al. DPU-Direct: Unleashing Remote Accelerators via Enhanced RDMA for Disaggregated Datacenters[J]. IEEE Transactions on Computers, 2024.
[27]Kang N, Wang Z, Yang F, et al. csRNA: Connection-Scalable RDMA NIC Architecture in Datacenter Environment[C]//2022 IEEE 40th International Conference on Computer Design (ICCD). IEEE, 2022: 398-406.
[28]Ma X, Yang F, Wang Z, et al. A Scalable RDMA Network Interface Card with Efficient Cache Management[C]//2023 IEEE International Symposium on Circuits and Systems (ISCAS). IEEE, 2023: 1-5.
[29]Wang X, Chen G, Yin X, et al. StaR: Breaking the scalability limit for RDMA[C]//2021 IEEE 29th International Conference on Network Protocols (ICNP). IEEE, 2021: 1-11.
[30]Wang Z, Luo L, Ning Q, et al. SRNIC: A Scalable Architecture for RDMA NICs[C]//20th USENIX Symposium on Networked Systems Design and Implementation (NSDI 23). 2023: 1-14.
[31]Chen Y, Lu Y, Shu J. Scalable RDMA RPC on reliable connection with efficient resource sharing[C]//Proceedings of the Fourteenth EuroSys Conference 2019. 2019: 1-14.
[32]Wang Z, Huang H, Zhang J, et al. FpgaNIC: An FPGA-based versatile 100Gb SmartNIC for GPUs[C]//2022 USENIX Annual Technical Conference (USENIX ATC 22). 2022: 967-986.