Design and Implementation of a High Performance Quadruple Precision Floating-point Multiplier Accumulator

doi:10.3969/j.issn.1000-3428.2014.02.064

Computer Engineering

HE Jun, HUANG Yong-qin, ZHU Ying

(Shanghai High Performance Integrated Circuit Design Centre, Shanghai 201204, China)

Received: 2012-12-26; publication date: 2014-02-15; release date: 2014-02-13
About the authors: HE Jun (born 1980), male, Ph.D. candidate, main research interest: microprocessor design; HUANG Yong-qin and ZHU Ying, senior engineers

Abstract

High-precision, high-performance floating-point units are an important part of high-performance microprocessor design. Based on a study of conventional double-precision floating-point multiply-add algorithms and on the characteristics of the Quadruple Precision (QP) floating-point data format, a high-performance Quadruple Precision Floating-point Multiplier Accumulator (QPFMA) is designed and implemented. The unit supports multiple floating-point operations, has a latency of 7 cycles, and is fully pipelined. A dual-path adder is adopted to improve the algorithm structure, and the leading-zero anticipation and normalization shift logic are optimized, which reduces both latency and hardware cost. A parameterized design and verification methodology makes it possible to verify the correctness of the QPFMA efficiently. Logic synthesis results show that, in a 65 nm process, the QPFMA can run at 1.2 GHz, with latency reduced by 3 cycles and frequency increased by about 11.63% compared with existing QPFMA designs.

Key words: floating-point arithmetic, multiply-add, Quadruple Precision (QP), high precision, parameterization
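The abstract refers to the characteristics of the quadruple-precision data format without spelling them out. As a minimal sketch, assuming the operands follow the IEEE 754-2008 binary128 layout (1 sign bit, 15 exponent bits with bias 16383, 112 fraction bits), the Python function below decodes a 128-bit word into an exact rational value. The function name `decode_binary128` and the constant names are hypothetical illustrations and are not taken from the paper; the sketch says nothing about how the QPFMA hardware itself is organized.

```python
from fractions import Fraction

# Field widths of the IEEE 754-2008 binary128 format (assumed here).
EXP_BITS = 15
FRAC_BITS = 112
BIAS = (1 << (EXP_BITS - 1)) - 1        # 16383
EXP_MAX = (1 << EXP_BITS) - 1           # all-ones exponent: infinity or NaN


def decode_binary128(word: int) -> Fraction:
    """Decode a 128-bit binary128 bit pattern into an exact Fraction.

    Hypothetical helper for illustration; infinities and NaNs are rejected.
    """
    if not 0 <= word < (1 << 128):
        raise ValueError("word must fit in 128 bits")

    sign = word >> (EXP_BITS + FRAC_BITS)            # bit 127
    exponent = (word >> FRAC_BITS) & EXP_MAX         # bits 126..112
    fraction = word & ((1 << FRAC_BITS) - 1)         # bits 111..0

    if exponent == EXP_MAX:
        raise ValueError("infinity or NaN has no rational value")

    if exponent == 0:
        # Zero or subnormal: no implicit leading 1, fixed exponent 1 - BIAS.
        significand = Fraction(fraction, 1 << FRAC_BITS)
        scale = Fraction(2) ** (1 - BIAS)
    else:
        # Normal number: implicit leading 1 gives a 113-bit significand.
        significand = 1 + Fraction(fraction, 1 << FRAC_BITS)
        scale = Fraction(2) ** (exponent - BIAS)

    magnitude = significand * scale
    return -magnitude if sign else magnitude


if __name__ == "__main__":
    one_and_a_half = (0x3FFF << FRAC_BITS) | (1 << (FRAC_BITS - 1))
    print(decode_binary128(one_and_a_half))
```

The 113-bit significand (112 stored bits plus the implicit leading 1) is more than twice the 53 bits of double precision, which suggests why the multiplier array, alignment shifter, and normalization logic grow so much in a quadruple-precision multiply-add datapath.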

CLC number: TP368.1

HE Jun, HUANG Yong-qin, ZHU Ying. Design and Implementation of a High Performance Quadruple Precision Floating-point Multiplier Accumulator[J]. Computer Engineering, doi: 10.3969/j.issn.1000-3428.2014.02.064.

http://www.ecice06.com/CN/Y2014/V40/I2/294

Editor: GU Yi-fei
