[1] FU H H, LIAO J F, YANG J Z, et al. The Sunway TaihuLight supercomputer: system and applications[J]. Science China Information Sciences, 2016, 59(7):1-16.
[2] HU X D, KE X M, YIN F, et al. Shenwei-26010: a high-performance many-core processor[J]. Journal of Computer Research and Development, 2021, 58(6):1155-1165. (in Chinese)
[3] Fluent. Fluent 6.2 user's guide[EB/OL]. [2022-07-05]. https://www.cfd-online.com/Forums/fluent/36245-fluent-6-2users-guide.html.
[4] FRINK N T. Tetrahedral unstructured Navier-Stokes method for turbulent flows[J]. AIAA Journal, 1998, 36(11):1975-1982.
[5] FRINK N T. Upwind scheme for solving the Euler equations on unstructured tetrahedral meshes[J]. AIAA Journal, 1992, 30(1):70-77.
[6] ANDERSON W K, BONHAUS D L. An implicit upwind algorithm for computing turbulent flows on unstructured grids[J]. Computers & Fluids, 1994, 23(1):1-21.
[7] NIELSEN E J. Aerodynamic design sensitivities on an unstructured mesh using the Navier-Stokes equations and a discrete adjoint formulation[EB/OL]. [2022-07-05]. https://theses.lib.vt.edu/theses/available/etd-110498-110349/unrestricted/thesis.pdf.
[8] GERHOLD T, FRIEDRICH O, EVANS J, et al. Calculation of complex three-dimensional configurations employing the DLR-tau-code[J]. AIAA Journal, 1997, 16(1):67-81.
[9] ANGELINI R C, SAHU J. Visualization techniques of a CFD++ data set of a spinning smart munition[EB/OL]. [2022-07-05]. https://apps.dtic.mil/sti/pdfs/ADA428396.pdf.
[10] MAVRIPLIS D J. Third drag prediction workshop results using the NSU3D unstructured mesh solver[J]. Journal of Aircraft, 2008, 45(3):750-761.
[11] JASAK H, JEMCOV A, TUKOVIC Z. OpenFOAM: a C++ library for complex physics simulations[EB/OL]. [2022-07-05]. https://www.researchgate.net/publication/228879492_OpenFOAM_A_C_library_for_complex_physics_simulations.
[12] TUREK S, BECKER C. Featflow: finite element software for the incompressible Navier-Stokes equations[EB/OL]. [2022-07-05]. https://www.semanticscholar.org/paper/FEATFLOW-Finite-element-software-for-the-equations-Turek-Becker/90aff87e5bec2b1e3ad3d3356a1da617a3e28059.
[13] POPINET S. Gerris: a tree-based adaptive solver for the incompressible Euler equations in complex geometries[J]. Journal of Computational Physics, 2003, 190(2):572-600.
[14] ARCHAMBEAU F, MÉCHITOUA N, SAKIZ M. Code Saturne: a finite volume code for the computation of turbulent incompressible flows, industrial applications[J]. International Journal on Finite Volumes, 2004, 1(1):1-62.
[15] BOLZ J, FARMER I, GRINSPUN E, et al. Sparse matrix solvers on the GPU[J]. ACM Transactions on Graphics, 2003, 22(3):917-924.
[16] BELL N, GARLAND M. Implementing sparse matrix-vector multiplication on throughput-oriented processors[C]//Proceedings of Conference for High Performance Computing Networking, Storage and Analysis. Washington D.C., USA: IEEE Press, 2009:1-11.
[17] VÁZQUEZ F, FERNÁNDEZ J J, GARZÓN E M. A new approach for sparse matrix vector product on NVIDIA GPUs[J]. Concurrency and Computation: Practice and Experience, 2011, 23(8):815-826.
[18] MONAKOV A, LOKHMOTOV A, AVETISYAN A. Automatically tuning sparse matrix-vector multiplication for GPU architectures[C]//Proceedings of International Conference on High-Performance Embedded Architectures and Compilers. Berlin, Germany: Springer, 2010:111-125.
[19] CHOI J W, SINGH A, VUDUC R W. Model-driven autotuning of sparse matrix-vector multiply on GPUs[J]. ACM SIGPLAN Notices, 2010, 45(5):115-126.
[20] KOZA Z, MATYKA M, SZKODA S, et al. Compressed multirow storage format for sparse matrices on graphics processing units[J]. SIAM Journal on Scientific Computing, 2014, 36(2):219-239.
[21] GREATHOUSE J L, DAGA M. Efficient sparse matrix-vector multiplication on GPUs using the CSR storage format[C]//Proceedings of International Conference for High Performance Computing, Networking, Storage and Analysis. Washington D.C., USA: IEEE Press, 2014:769-780.
[22] ASHARI A, SEDAGHATI N, EISENLOHR J, et al. Fast sparse matrix-vector multiplication on GPUs for graph applications[C]//Proceedings of International Conference for High Performance Computing, Networking, Storage and Analysis. Washington D.C., USA: IEEE Press, 2014:781-792.
[23] MERRILL D, GARLAND M. Merge-based parallel sparse matrix-vector multiplication[C]//Proceedings of International Conference for High Performance Computing, Networking, Storage and Analysis. Washington D.C., USA: IEEE Press, 2016:1-12.
[24] LIU W F, VINTER B. CSR5: an efficient storage format for cross-platform sparse matrix-vector multiplication[C]//Proceedings of the 29th ACM International Conference on Supercomputing. New York, USA: ACM Press, 2015:339-350.
[25] BULUÇ A, FINEMAN J T, FRIGO M, et al. Parallel sparse matrix-vector and matrix-transpose-vector multiplication using compressed sparse blocks[C]//Proceedings of the 21st Annual Symposium on Parallelism in Algorithms and Architectures. New York, USA: ACM Press, 2009:233-244.
[26] ASHARI A, SEDAGHATI N, EISENLOHR J, et al. An efficient two-dimensional blocking strategy for sparse matrix-vector multiplication on GPUs[C]//Proceedings of the 28th ACM International Conference on Supercomputing. New York, USA: ACM Press, 2014:15-26.
[27] LIANG Y, TANG W T, ZHAO R Z, et al. Scale-free sparse matrix-vector multiplication on many-core architectures[J]. IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, 2017, 36(12):2106-2119.
[28] YAN S, LI C, ZHANG Y, et al. yaSpMV: yet another SpMV framework on GPUs[J]. ACM SIGPLAN Notices, 2014, 49(8):107-118.
[29] LIU F F, YANG C, YUAN X H, et al. General SpMV implementation in many-core domestic Sunway 26010 processor[J]. Journal of Software, 2018, 29(12):3921-3932. (in Chinese)
[30] LIU C X, XIE B W, LIU X, et al. Towards efficient SpMV on Sunway manycore architectures[C]//Proceedings of 2018 International Conference on Supercomputing. Washington D.C., USA: IEEE Press, 2018:363-373.
[31] NI H, LIU X. Multi-core optimization technology of unstructured grid based on Sunway TaihuLight[J]. Computer Engineering, 2019, 45(6):45-51. (in Chinese)
[32] NI H, LIU X. Many-core optimization for sparse triangular solver under unstructured grids[J]. Computer Science, 2019, 46(S1):518-522. (in Chinese)
[33] CHEN Y D, XIAO G Q, WU F, et al. tpSpMV: a two-phase large-scale sparse matrix-vector multiplication kernel for manycore architectures[J]. Information Sciences, 2020, 523:279-295.