[1] XU Z W,CHI X B,XIAO N.High-performance computing environment:a review of twenty years of experiments in China[J].National Science Review,2016,3(1):36-48.
[2] WANG H Q,PENG S L,ZHU X Q,et al.A method to accelerate GROMACS in offload mode on Tianhe-2 supercomputer[C]//Proceedings of the 15th IEEE/ACM International Symposium on Cluster,Cloud and Grid Computing.Washington D.C.,USA:IEEE Press,2015:781-784.
[3] PIÑEIRO C,PICHEL J C.A unified framework to improve the interoperability between HPC and Big Data languages and programming models[J].Future Generation Computer Systems,2022,134:123-139.
[4] YIN F,SHI F.A comparative survey of big data computing and HPC:from a parallel programming model to a cluster architecture[J].International Journal of Parallel Programming,2022,50(1):27-64.
[5] HEINECKE A,BREUER A,RETTENBERGER S,et al.Petascale high order dynamic rupture earthquake simulations on heterogeneous supercomputers[C]//Proceedings of International Conference for High Performance Computing,Networking,Storage and Analysis.Washington D.C.,USA:IEEE Press,2014:3-14.
[6] YAN D,WANG W,CHU X.An LLVM-based open-source compiler for NVIDIA GPUs[C]//Proceedings of the 27th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming.New York,USA:ACM Press,2022:448-449.
[7] SHOBAKI G,KERBOW A,PULIDO C,et al.Exploring an alternative cost function for combinatorial register-pressure-aware instruction scheduling[J].ACM Transactions on Architecture and Code Optimization,2019,16(1):1-30.
[8] CHEN S M,WANG Y H,LIU S,et al.FT-Matrix:a coordination-aware architecture for signal processing[J].IEEE Micro,2014,34(6):64-73.
[9] 荀长庆,陈照云,文梅,等.以编译为导向的Matrix-DSP程序分析与优化[J].计算机工程与科学,2020,42(10):1791-1800. XUN C Q,CHEN Z Y,WEN M,et al.Compilation-oriented code analysis and optimization for Matrix-DSP[J].Computer Engineering & Science,2020,42(10):1791-1800.(in Chinese)
[10] PANDEY M,SARDA S.LLVM cookbook[M].[S.l.]:Packt,2015:296.
[11] LOZANO R C,CARLSSON M,DREJHAMMAR F,et al.Constraint-based register allocation and instruction scheduling[C]//Proceedings of International Conference on Principles and Practice of Constraint Programming.Berlin,Germany:Springer,2012:750-766.
[12] SHOBAKI G,GORDON V S,MCHUGH P,et al.Register-pressure-aware instruction scheduling using ant colony optimization[J].ACM Transactions on Architecture and Code Optimization,2022,19(2):23.
[13] DORIGO M,MANIEZZO V,COLORNI A.Ant system:optimization by a colony of cooperating agents[J].IEEE Transactions on Systems,Man,and Cybernetics,Part B:Cybernetics,1996,26(1):29-41.
[14] 刘胜,卢凯,郭阳,等.一种自主设计的面向E级高性能计算的异构融合加速器[J].计算机研究与发展,2021,58(6):1234-1237. LIU S,LU K,GUO Y,et al.A self-designed heterogeneous fusion accelerator for exascale high performance computing[J].Journal of Computer Research and Development,2021,58(6):1234-1237.(in Chinese)
[15] GIESEMANN F,GERLACH L,PAYÁ-VAYÁ G.Evolutionary algorithms for instruction scheduling,operation merging,and register allocation in VLIW compilers[J].Journal of Signal Processing Systems,2020,92(7):655-678.
[16] LOZANO R C,CARLSSON M,BLINDELL G H,et al.Combinatorial register allocation and instruction scheduling[J].ACM Transactions on Programming Languages and Systems,2019,41(3):17.
[17] MALEKI S,GAO Y Q,GARZARÁN M J,et al.An evaluation of vectorizing compilers[C]//Proceedings of International Conference on Parallel Architectures and Compilation Techniques.Washington D.C.,USA:IEEE Press,2011:372-382.
[18] 李嘉楠,韩林,柴赟达.面向国产平台的LLVM自动向量化移植与优化[J].计算机工程,2022,48(1):142-148. LI J N,HAN L,CHAI Y D.Automatic vectorization transplant and optimization of LLVM for domestic processors[J].Computer Engineering,2022,48(1):142-148.(in Chinese)
[19] 冯竞舸,贺也平,陶秋铭.自动向量化:近期进展与展望[J].通信学报,2022,43(3):180-195. FENG J G,HE Y P,TAO Q M.Auto-vectorization:recent development and prospect[J].Journal on Communications,2022,43(3):180-195.(in Chinese)
[20] MAMMADLI R,JANNESARI A,WOLF F.Static neural compiler optimization via deep reinforcement learning[C]//Proceedings of 2020 IEEE/ACM Workshop on the LLVM Compiler Infrastructure in HPC(LLVM-HPC) and Workshop on Hierarchical Parallelism for Exascale Computing(HiPar).Washington D.C.,USA:IEEE Press,2020:1-10.
[21] WU L,PEI J,TANG J,et al.Deep learning on graphs:methods and applications[C]//Proceedings of the 29th ACM SIGKDD Conference on Knowledge Discovery and Data Mining.New York,USA:ACM Press,2022:4906-4907.
[22] WU Z H,PAN S R,CHEN F W,et al.A comprehensive survey on graph neural networks[J].IEEE Transactions on Neural Networks and Learning Systems,2021,32(1):4-24.
[23] FEY M,LENSSEN J E.Fast graph representation learning with PyTorch Geometric[EB/OL].[2023-01-02].http://arxiv.org/pdf/1903.02428.
[24] WANG M J,YU L F,ZHENG D,et al.Deep graph library:towards efficient and scalable deep learning on graphs[EB/OL].[2023-01-02].http://arxiv.org/abs/1909.01315v1.
[25] 池昊宇,陈长波.基于机器学习的编译器自动调优综述[J].计算机科学,2022,49(1):241-251. CHI H Y,CHEN C B.Survey on automatic tuning of compilers by machine learning[J].Computer Science,2022,49(1):241-251.(in Chinese)