Optimization of Molecular Dynamics Algorithm for Solid Crystalline Silicon Based on GPU

doi:10.19678/j.issn.1000-3428.0063895

Abstract

Molecular Dynamics (MD) simulations are widely used to investigate the thermodynamic properties of crystalline silicon. Because the atoms interact through a complex many-body potential, these simulations carry a heavy computational load, which limits the time and length scales that can be reached. The Graphics Processing Unit (GPU), with its multithreaded parallelism suited to compute-intensive workloads, shows significant potential for MD simulations. Fully exploiting the GPU hardware architecture to extend the spatial and temporal scales of MD simulations of solid covalent crystalline silicon is therefore important for studying the thermal conduction mechanism of crystalline silicon. Based on the MD simulation algorithm for solid covalent crystalline silicon, a fixed neighbor algorithm is designed and optimized for the GPU computing platform. Data structure optimization, branch structure optimization, and other methods are used to address the time-consuming global memory accesses and branch structures of the fixed neighbor algorithm, reducing memory access overhead and branch divergence. The thread parallel scheduling mode is also changed to achieve high-performance parallel computing on the GPU, which effectively relieves the computational load. Experimental results show that the double-precision fixed neighbor algorithm achieves a speedup of 11.62 over the LAMMPS double-precision MD simulation of solid crystalline silicon, and that the double-precision and single-precision fixed neighbor algorithms achieve speedups of 9.39 and 12.18, respectively, over the HOOMD-blue double-precision MD simulation of solid crystalline silicon.

Key words: Molecular Dynamics (MD) simulation, Graphics Processing Unit (GPU), fixed neighbor, data structure, branch structure
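The fixed neighbor idea named in the abstract can be illustrated with a small sketch. It is not the authors' GPU implementation: it is plain Python written for readability, and every function name, the 2.6 Å cutoff, and the flat table layout are assumptions made for this example. It assumes that atoms in a solid crystal stay near their lattice sites, so the neighbor table can be built once and reused. The value 5.431 Å is the standard lattice constant of silicon. Coordinates are kept in separate x, y, z arrays, and because every atom in the diamond lattice has exactly four nearest neighbors, the inner loop needs no neighbor count check or cutoff branch.

```python
import math

A = 5.431  # silicon lattice constant in angstroms
# Fractional coordinates of the 8 atoms in a diamond cubic unit cell.
BASIS = [(0, 0, 0), (0, .5, .5), (.5, 0, .5), (.5, .5, 0),
         (.25, .25, .25), (.25, .75, .75), (.75, .25, .75), (.75, .75, .25)]
NEIGHBORS_PER_ATOM = 4  # nearest neighbors of each atom in the diamond lattice


def build_lattice(n):
    """Return x, y, z coordinate lists (structure of arrays) and the box length."""
    xs, ys, zs = [], [], []
    for i in range(n):
        for j in range(n):
            for k in range(n):
                for bx, by, bz in BASIS:
                    xs.append((i + bx) * A)
                    ys.append((j + by) * A)
                    zs.append((k + bz) * A)
    return xs, ys, zs, n * A


def min_image(d, box):
    """Wrap a coordinate difference into the nearest periodic image."""
    return d - box * round(d / box)


def build_fixed_neighbors(xs, ys, zs, box, cutoff=2.6):
    """Build the neighbor table once; atom i owns slots 4*i .. 4*i+3."""
    table = []
    for i in range(len(xs)):
        found = []
        for j in range(len(xs)):
            if j == i:
                continue
            dx = min_image(xs[j] - xs[i], box)
            dy = min_image(ys[j] - ys[i], box)
            dz = min_image(zs[j] - zs[i], box)
            if dx * dx + dy * dy + dz * dz < cutoff * cutoff:
                found.append(j)
        assert len(found) == NEIGHBORS_PER_ATOM
        table.extend(found)
    return table


def mean_bond_length(xs, ys, zs, box, table):
    """Loop over the fixed table: no neighbor search and no cutoff test."""
    total = 0.0
    for i in range(len(xs)):
        base = i * NEIGHBORS_PER_ATOM
        for s in range(NEIGHBORS_PER_ATOM):
            j = table[base + s]
            dx = min_image(xs[j] - xs[i], box)
            dy = min_image(ys[j] - ys[i], box)
            dz = min_image(zs[j] - zs[i], box)
            total += math.sqrt(dx * dx + dy * dy + dz * dz)
    return total / (len(xs) * NEIGHBORS_PER_ATOM)


if __name__ == "__main__":
    xs, ys, zs, box = build_lattice(2)
    table = build_fixed_neighbors(xs, ys, zs, box)
    print(len(xs), "atoms, mean bond length:", mean_bond_length(xs, ys, zs, box, table))
```

In a GPU version, one thread per atom would presumably run the body of the outer loop in `mean_bond_length`, with a real many-body potential in place of the bond-length sum. The fixed stride of four entries per atom is what keeps the loop free of data-dependent branches in this sketch.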

CLC Number: TP391.9

LI Jing, ZHU Aiqi, HAN Lin, HOU Chaofeng. Optimization of Molecular Dynamics Algorithm for Solid Crystalline Silicon Based on GPU[J]. Computer Engineering, 2023, 49(3): 288-295.

URL: http://www.ecice06.com/EN/10.19678/j.issn.1000-3428.0063895

http://www.ecice06.com/EN/Y2023/V49/I3/288
