Computer Engineering, 2012, Vol. 38, Issue 16: 249-252. doi: 10.3969/j.issn.1000-3428.2012.16.065

• Engineering Application Technology and Implementation •

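  • About the authors: DONG Mian (born 1988), male, master's student, main research interests: large-scale integrated circuit design and computer architecture; WU Dan, Ph.D. candidate; RAO Jin-li and HUANG Wei, master's students; DAI Kui and ZOU Xue-cheng, professors and doctoral supervisors
  • Supported by:
    National Natural Science Foundation of China (NSFC 60976027, 60973035); Natural Science Foundation of Hubei Province (ZRZ0051, 2010CDB02705)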

Design and Implementation of High Performance Subword-Parallel Arithmetic Units

DONG Mian, WU Dan, RAO Jin-li, HUANG Wei, DAI Kui, ZOU Xue-cheng   

  1. (Department of Electronic Science & Technology, Huazhong University of Science & Technology, Wuhan 430074, China)
  • Received: 2012-01-05  Online: 2012-08-20  Published: 2012-08-17

Abstract: A set of high-performance subword-parallel arithmetic units is implemented through hardware sharing. The units are pipelined and can perform, in a single cycle, one 64-bit, two 32-bit, four 16-bit, or eight 8-bit fixed-point operations, or one double-precision or two single-precision floating-point operations. The arithmetic units are designed in Verilog HDL and implemented in a 0.18 μm standard CMOS process. Performance is evaluated with a real multimedia application on the Engineering and Scientific Computing Accelerator (ESCA) system. Experimental results show that the subword-parallel arithmetic units achieve a good tradeoff between hardware cost and performance.
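
To make the fixed-point modes in the abstract concrete, the following is a minimal software sketch of subword-parallel addition: a single 64-bit adder is shared across lanes by suppressing carries at lane boundaries. It is only an illustrative model under stated assumptions, not the authors' Verilog design; the function name `packed_add` and the carry-masking technique are assumptions introduced here.

```python
MASK64 = (1 << 64) - 1  # confine results to a 64-bit word


def packed_add(a: int, b: int, lane_bits: int) -> int:
    """Add two 64-bit words lane by lane, wrapping inside each lane.

    Hypothetical helper for illustration: lane_bits selects one 64-bit,
    two 32-bit, four 16-bit, or eight 8-bit lanes.
    """
    if lane_bits not in (8, 16, 32, 64):
        raise ValueError("lane_bits must be 8, 16, 32, or 64")

    # Build a mask holding the most significant bit of every lane.
    high = 0
    for lane in range(64 // lane_bits):
        high |= 1 << (lane * lane_bits + lane_bits - 1)
    low = MASK64 & ~high

    # Add with each lane's top bit cleared, so no carry can cross
    # into the neighbouring lane.
    partial = (a & low) + (b & low)

    # Restore each lane's top bit: sum bit = a_msb ^ b_msb ^ carry_in,
    # where carry_in already sits in that position of `partial`.
    return (partial ^ ((a ^ b) & high)) & MASK64


if __name__ == "__main__":
    a, b = 0xFFFFFFFFFFFFFFFF, 0x0000000000000001
    for bits in (8, 16, 32, 64):
        print(bits, hex(packed_add(a, b, bits)))
```

The same word pair gives a different result for each lane width because the carry chain is cut at different positions, which is the behavior a shared, mode-configurable adder needs to provide.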

Key words: multimedia technology, subword parallelism, hardware sharing, arithmetic units, Engineering and Scientific Computing Accelerator (ESCA) system, co-processor
