Computer Engineering ›› 2026, Vol. 52 ›› Issue (3): 276-286. doi: 10.19678/j.issn.1000-3428.0070244

• Computer Architecture and Advanced Computing •

Optimization of Posit Multiplication Unit Based on Improved Wallace Tree

GAO Zhiyong1, WANG Lei1,*, LIU Bowen2, YING Jinrui2, WANG Panlong2

  1. School of Computer and Artificial Intelligence, Zhengzhou University, Zhengzhou 450001, Henan, China
  2. School of Cyberspace Security, Zhongyuan University of Technology, Zhengzhou 450007, Henan, China
  • Received: 2024-08-13 Revised: 2024-09-15 Online: 2026-03-15 Published: 2024-12-10
  • Contact: WANG Lei

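  • About the authors:

    GAO Zhiyong, male, master's student; main research interest: high-performance computing

    WANG Lei (corresponding author), professor

    LIU Bowen, master's student

    YING Jinrui, master's student

    WANG Panlong, master's student

  • Funding:
    Cooperation Project of the Institute of Software, Chinese Academy of Sciences (22001742)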

Abstract:

The Posit format, a novel floating-point representation, offers significant advantages over the IEEE 754 floating-point format in dynamic range and rounding-error handling. However, its hardware implementation, particularly the design of the mantissa multiplier, remains challenging. To address this, this paper proposes an improved Wallace tree algorithm, the 3L-Wallace tree, which reduces the number of stages in partial-product summation and thereby lowers both hardware resource consumption and overall latency. The improvement is achieved by adding specific counters, redesigning the layout of the counters in the partial-product summation stage, and improving the adder used in the final summation stage. The 3L-Wallace tree is then applied to optimize the Posit multiplication unit. In addition, a modular design approach divides large bit-width multipliers into smaller modules that are easier to implement, simplifying the design process. A dynamic selection algorithm is also designed, which chooses a multiplier of appropriate bit width at runtime according to the mantissa width, avoiding wasted hardware resources. Experimental results show that, compared with traditional methods, the 3L-Wallace tree algorithm reduces hardware resource consumption by approximately 9.5%, power consumption by approximately 8.1%, and latency by approximately 10.4% on average, with the advantage most pronounced for large bit-width multipliers.
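For readers unfamiliar with the underlying technique, the following is a minimal bit-level sketch in Python of a conventional Wallace-style reduction built only from 3:2 counters (full adders). It is not the 3L-Wallace tree, whose counters and layout the abstract does not specify. The helpers `split_multiply` and `select_width`, and the available widths `(8, 16, 32)`, are hypothetical names and values chosen here only to illustrate the modular-decomposition and dynamic-selection ideas.

```python
from typing import List, Sequence, Tuple


def partial_products(a: int, b: int, width: int) -> List[List[int]]:
    """Build the partial-product bit matrix; cols[k] holds bits of weight 2**k."""
    cols: List[List[int]] = [[] for _ in range(2 * width)]
    for i in range(width):
        for j in range(width):
            cols[i + j].append((a >> i) & (b >> j) & 1)
    return cols


def reduce_stage(cols: List[List[int]]) -> List[List[int]]:
    """One reduction stage: every group of three bits in a column goes
    through a 3:2 counter; leftover bits pass through unchanged."""
    out: List[List[int]] = [[] for _ in range(len(cols) + 1)]
    for k, col in enumerate(cols):
        full = len(col) - len(col) % 3
        for i in range(0, full, 3):
            x, y, z = col[i:i + 3]
            out[k].append(x ^ y ^ z)                        # sum bit
            out[k + 1].append((x & y) | (x & z) | (y & z))  # carry bit
        out[k].extend(col[full:])
    while out and not out[-1]:
        out.pop()  # drop unused high columns
    return out


def wallace_multiply(a: int, b: int, width: int) -> Tuple[int, int]:
    """Return (product, number of reduction stages) for unsigned operands."""
    assert 0 <= a < (1 << width) and 0 <= b < (1 << width)
    cols = partial_products(a, b, width)
    stages = 0
    while any(len(col) > 2 for col in cols):
        cols = reduce_stage(cols)
        stages += 1
    # Final summation: stands in for the carry-propagate adder.
    product = sum(bit << k for k, col in enumerate(cols) for bit in col)
    return product, stages


def split_multiply(a: int, b: int, width: int) -> int:
    """Hypothetical modular decomposition: one width-bit product assembled
    from four half-width products."""
    assert width % 2 == 0
    half = width // 2
    mask = (1 << half) - 1
    a_hi, a_lo = a >> half, a & mask
    b_hi, b_lo = b >> half, b & mask
    hh = wallace_multiply(a_hi, b_hi, half)[0]
    hl = wallace_multiply(a_hi, b_lo, half)[0]
    lh = wallace_multiply(a_lo, b_hi, half)[0]
    ll = wallace_multiply(a_lo, b_lo, half)[0]
    return (hh << (2 * half)) + ((hl + lh) << half) + ll


def select_width(mantissa_bits: int, available: Sequence[int] = (8, 16, 32)) -> int:
    """Hypothetical dynamic selection: the narrowest available multiplier
    that still fits the runtime mantissa width."""
    for w in sorted(available):
        if mantissa_bits <= w:
            return w
    raise ValueError("mantissa wider than any available multiplier")
```

Calling `wallace_multiply(13, 11, 4)` returns the product together with the number of reduction stages the plain 3:2 scheme needed. That stage count is the quantity an improved counter layout would aim to reduce.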

Key words: Posit format, Wallace tree, multiplication unit, mantissa multiplier, counters, Field-Programmable Gate Array (FPGA)
