Computer Engineering

• Advanced Computing and Data Processing •

Implementation and Optimization of Extended Function Library Based on SW26010 Processor

CAO Dai, GUO Shaozhong, ZHANG Xin

  1. (State Key Laboratory of Mathematical Engineering and Advanced Computing, Zhengzhou 450002, China)
  • Received: 2016-01-07 Online: 2017-01-15 Published: 2017-01-13
  • About the authors: CAO Dai (born 1990), male, master's student, main research interest: high-performance computing; GUO Shaozhong, professor; ZHANG Xin, master's student.
  • Funding:
    National "863" Program of China (2009AA012201).

Abstract: Intel, AMD, and IBM each provide vector extension libraries tailored to their own architectures. Compared with traditional scalar computation, vectorization yields a high speedup. This paper therefore develops a vector math library for the SW26010 processor. Building on an analysis of the series and iterative methods commonly used to evaluate functions, it studies an efficient vectorized algorithm for trigonometric, inverse trigonometric, exponential, and logarithmic functions, then implements and optimizes it so that it supports high-precision, high-performance function evaluation and meets the requirements of floating-point arithmetic. Test results show that the algorithm's precision meets the requirements of specific applications on the SW26010 processor and that, compared with the Intel VML math library, every function achieves an average speedup of more than 1.1.

Keywords: floating-point calculation, mathematical function, SW26010 processor, data segmentation, instruction scheduling

CLC number: