[1] 骆涛.面向大数据处理的并行计算模型及性能优化[D].合肥:中国科学技术大学, 2015. LUO T.Parallel computational model and performance optimization on big data[D].Hefei:University of Science and Technology of China, 2015.(in Chinese) [2] TOP 10 Sites for November 2016[EB/OL].[2021-09-01].https://www.top500.org/lists/top500/2016/11/. [3] TOP 10 Sites for November 2021[EB/OL].[2021-09-01].https://www.top500.org/lists/top500/2021/11/. [4] 张明.龙芯平台上高性能计算的性能优化关键问题研究[D].合肥:中国科学技术大学, 2017. ZHANG M.Research on key issues of performance optimization in high performance computing based on the godson[D].Hefei:University of Science and Technology of China, 2017.(in Chinese) [5] WULF W A, MCKEE S A.Hitting the memory wall[J].ACM SIGARCH Computer Architecture News, 1995, 23(1):20-24. [6] NOWATZYK A, PONG F, SAULSBURY A.Missing the memory wall:the case for processor/memory integration[J].ACM Sigarch Computer Architecture News, 1996, 24(2):90-101. [7] SMITH A J.Cache memories[J].ACM Computing Surveys, 1982, 14(3):473-530. [8] LI P.Analysis and development of the locality principle[J].Advances in Intelligent and Soft Computing, 2012, 133(7):211-214. [9] 刘扬, 安虹, 邓博斌, 等.程序局部性的量化分析[J].计算机工程, 2013, 39(1):67-70, 75. LIU Y, AN H, DENG B B, et al.Quantization analysis of program locality[J].Computer Engineering, 2013, 39(1):67-70, 75.(in Chinese) [10] CONWAY P, KALYANASUNDHARAM N, DONLEY G, et al.Cache hierarchy and memory subsystem of the AMD opteron processor[J].IEEE Micro, 2010, 30(2):16-29. [11] 田新华, 欧国东, 张民选.基于修正LRU的压缩Cache替换策略[J].计算机工程, 2008, 34(18):7-9, 16. TIAN X H, OU G D, ZHANG M X.Replacement policy for compressed cache based on modified LRU[J].Computer Engineering, 2008, 34(18):7-9, 16.(in Chinese) [12] JALEEL A, THEOBALD K B, STEELY S C J, et al.High performance cache replacement using re-reference interval prediction[J].ACM SIGARCH Computer Architecture News, 2010, 38(3):60-71. [13] QURESHI M K, JALEEL A, PATT Y N, et al.Set-dueling-controlled adaptive insertion for high-performance caching[J].IEEE Micro, 2008, 28(1):91-98. [14] CHANG J C, SOHI G S.Cooperative cache partitioning for chip multiprocessors[C]//Proceedings of the 25th Anniversary ACM International Conference on Supercomputing.New York, USA:ACM Press, 2014:402-412. [15] BAROKAR G.Efficient methods of cooperative cache rartitioning[M].[S.1.]:LAP Lambert Academic Publishing, 2015. [16] KURIAN G, KHAN O, DEVADAS S.The locality-aware adaptive cache coherence protocol[J].ACM SIGARCH Computer Architecture News, 2013, 41(3):523-534. [17] MITTAL S.A survey of techniques for architecting TLBs[J].IEEE Transactions on Parallel & Distributed Systems, 2017, 29(10):40-61. [18] KANDIRAJU G B, SIVASUBRAMANIAM A.Going the distance for TLB prefetching:an application-driven study[C]//Proceedings of the 29th Annual International Symposium on Computer Architecture.Washington D.C., USA:IEEE Press, 2002:195-206. [19] Intel Corporation.TLBs, paging-structure cache and their invalidation[EB/OL].[2021-09-01].http://www.intel.com/support/processors/sb/cs-009861.htm. [20] EBNER D, BRANDNER F, SCHOLZ B, et al.Generalized instruction selection using SSA-graphs[J].ACM SIGPLAN Notices, 2008, 43(7):31-40. [21] 何军, 张晓东, 郭勇.一种TLB结构优化方法[J].计算机工程, 2012, 38(21):253-256. HE J, ZHANG X D, GUO Y.An optimization method of TLB architecture[J].Computer Engineering, 2012, 38(21):253-256.(in Chinese) [22] LOZANO R C, CARLSSON M, BLINDELL G H, et al.Combinatorial register allocation and instruction scheduling[J].ACM Transactions on Programming Languages and Systems, 2019, 41(3):1-53. [23] 廉玉龙, 史峥, 李春强, 等.基于C-SKY CPU的地址立即数编译优化方法[J].计算机工程, 2016, 42(1):46-50. LIAN Y L, SHI Z, LI C Q, et al.Compiling optimization method of address immediate value based on C-SKY CPU[J].Computer Engineering, 2016, 42(1):46-50.(in Chinese) [24] 董钰山, 李春江, 徐颖.GCC编译器中循环数组预取优化的实现及效果[J].计算机工程与应用, 2016, 52(6):19-25. DONG Y S, LI C J, XU Y.Implementation and effects of loop-array-prefetching optimization in GCC[J].Computer Engineering and Applications, 2016, 52(6):19-25.(in Chinese) [25] 王亚刚.深入分析GCC[M].北京:机械工业出版社, 2017. WANG Y G.In depth analysis of GCC[M].Beijing:China Machine Press, 2017.(in Chinese) [26] HENNING J L.SPEC CPU2006 benchmark descriptions[J].ACM SIGARCH Computer Architecture News, 2006, 34(4):1-17. |