
Computer Engineering ›› 2020, Vol. 46 ›› Issue (9): 35-43. doi: 10.19678/j.issn.1000-3428.0056967

• Hot Topics and Reviews •

Cosmological Multi-Body Simulation Based on Sunway TaihuLight

LIU Xu1,2, ZHANG Xihuang1, LIU Zhao2,3, LÜ Xiaojing2, ZHU Guanghui4

  1. School of Internet of Things Engineering, Jiangnan University, Wuxi, Jiangsu 214122, China;
    2. National Supercomputing Center in Wuxi, Wuxi, Jiangsu 214072, China;
    3. Tsinghua University, Beijing 100084, China;
    4. Wuxi Aerospace Jiangnan Data System Technology Co., Ltd., Wuxi, Jiangsu 214000, China
  • Received: 2019-12-19 Revised: 2020-02-28 Published: 2020-03-06
  • About the authors: LIU Xu (born 1995), male, master's student, whose research interests are high-performance computing and parallel compilation and optimization; ZHANG Xihuang, professor, Ph.D.; LIU Zhao, engineer, Ph.D. candidate; LÜ Xiaojing and ZHU Guanghui, engineers, M.S.
  • Funding:
    National Natural Science Foundation of China (51877115).

Abstract: Cosmological simulations are essential for studying the formation of non-linear structure and hypothesized components such as dark matter and dark energy. High-precision cosmological simulations involve hundreds of billions or even trillions of particles, and the computing power of supercomputers makes them an ideal platform for such problems. To implement cosmological N-body simulation on Sunway TaihuLight, a supercomputer developed in China, this paper analyzes the Particle Mesh (PM) algorithm and the Fast Multipole Method (FMM) used in the PHoToNs software. Based on the many-core processor architecture, it proposes several performance optimization techniques, including a multi-level decomposition and load balancing scheme, a pipelining strategy for tree traversal and gravity calculation, and a vectorized gravity calculation algorithm. With these techniques, the N-body simulation software SwPHoToNs is implemented to exploit the architectural advantages of Sunway TaihuLight. Experimental results show that, in a cosmological simulation of 640 billion particles on 5 200 000 cores of Sunway TaihuLight, SwPHoToNs achieves a sustained performance of 29.44 PFLOPS, with a parallel efficiency of 84.6% and a computational efficiency of 48.3%.
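The reported figures can be put in proportion with some simple arithmetic. The sketch below is illustrative only: it assumes that computational efficiency means sustained performance divided by the peak performance of the cores used, a definition the abstract does not state, and the function names are hypothetical.

```python
# Figures quoted in the abstract.
SUSTAINED_PFLOPS = 29.44          # sustained performance
COMPUTATIONAL_EFFICIENCY = 0.483  # 48.3%
CORES = 5_200_000                 # compute cores used
PARTICLES = 640e9                 # 640 billion particles


def implied_peak_pflops(sustained_pflops, efficiency):
    """Peak rate implied by the ASSUMED definition efficiency = sustained / peak."""
    return sustained_pflops / efficiency


def per_core_gflops(sustained_pflops, cores):
    """Average sustained rate per core, in GFLOPS (1 PFLOPS = 1e6 GFLOPS)."""
    return sustained_pflops * 1e6 / cores


def particles_per_core(particles, cores):
    """Average number of particles handled by each core."""
    return particles / cores


if __name__ == "__main__":
    print(f"implied peak:       {implied_peak_pflops(SUSTAINED_PFLOPS, COMPUTATIONAL_EFFICIENCY):.2f} PFLOPS")
    print(f"sustained per core: {per_core_gflops(SUSTAINED_PFLOPS, CORES):.2f} GFLOPS")
    print(f"particles per core: {particles_per_core(PARTICLES, CORES):,.0f}")
```

Under that assumption, the run corresponds to a peak of roughly 61 PFLOPS for the cores involved, about 5.66 GFLOPS sustained per core, and about 123 000 particles per core on average.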

Key words: Sunway TaihuLight, cosmology, multi-body simulation, parallel optimization, scalability

CLC Number: