
Computer Engineering (计算机工程)

• Architecture and Software Technology •

Customized Network-on-Chip Oriented to MPI Collective Operations

LU Siyu (陆思羽), WANG Hongwei (王宏伟), ZHANG Youhui (张悠慧), YANG Guangwen (杨广文), ZHENG Weimin (郑纬民)

  1. (Department of Computer Science and Technology, Tsinghua University, Beijing 100084, China)
  • Received: 2016-05-16 Online: 2017-06-15 Published: 2017-06-15
  • About the authors: LU Siyu (born 1990), female, master's student, main research interest: computer architecture; WANG Hongwei, master's degree; ZHANG Youhui, YANG Guangwen, and ZHENG Weimin, professors.
  • Funding:
    National "863" Program of China (2013AA01A215).

Abstract: Following the principle of moving computation close to data, this paper proposes a design method for a customized Network-on-Chip (NoC) oriented to MPI collective operations. The method enhances the hardware of existing on-chip routers so that MPI collective operations are accelerated at the network layer. It first designs the MPI reduction operation, extends it to several other collective operations, and combines it with an adaptive method that targets deterministic routing algorithms and dynamically learns the transmission paths of messages. As a result, collective operations can be completed in place on the enhanced routers, which speeds up processing, coalesces messages, and reduces the load on processor cores. The paper also presents the micro-architecture design of the enhanced router, compares layout strategies for enhanced routers in different NoCs, and evaluates the corresponding performance, power consumption, and extra chip area. Test results show that, compared with the optimal software-based implementation, the proposed method improves reduction performance by 6.4 to 41.7 times, broadcast by 15.3 to 31.2 times, global reduction (allreduce) by 5.4 to 9.7 times, and gather by 1.3 to 1.8 times, while consuming only limited power and chip area.

Key words: Network-on-Chip (NoC), Chip Multi-Processor (CMP), Message Passing Interface (MPI), collective operation, customization
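
The idea of completing a reduction inside the routers can be illustrated with a small model. The sketch below is not the paper's design. It assumes a 2D mesh, deterministic XY routing, and one value per core, and all function names are hypothetical. It compares a naive baseline, in which every core sends its value unmodified to the root, with a scheme in which each router merges the partial results passing through it and forwards a single message. The naive baseline is also not the optimal software implementation used in the paper's evaluation, so the link counts are illustrative only.

```python
import operator
from typing import Callable, Dict, Tuple

Coord = Tuple[int, int]


def distance(a: Coord, b: Coord) -> int:
    """Manhattan distance, i.e. the hop count on a mesh."""
    return abs(a[0] - b[0]) + abs(a[1] - b[1])


def next_hop(node: Coord, root: Coord) -> Coord:
    """Deterministic XY routing: correct the X coordinate first, then Y."""
    x, y = node
    rx, ry = root
    if x != rx:
        return (x + (1 if rx > x else -1), y)
    return (x, y + (1 if ry > y else -1))


def naive_reduce(values: Dict[Coord, int], root: Coord,
                 op: Callable[[int, int], int]) -> Tuple[int, int]:
    """Assumed baseline: every core sends its own value to the root,
    and the root core performs all combine steps."""
    result = values[root]
    link_traversals = 0
    for node, value in values.items():
        if node == root:
            continue
        link_traversals += distance(node, root)
        result = op(result, value)
    return result, link_traversals


def in_router_reduce(values: Dict[Coord, int], root: Coord,
                     op: Callable[[int, int], int]) -> Tuple[int, int]:
    """Each router combines what arrives from upstream with its local
    value and forwards one merged message toward the root."""
    partial = dict(values)
    # Farthest nodes first: a router's upstream neighbours are always one
    # hop farther from the root, so they are merged before it forwards.
    order = sorted((n for n in values if n != root),
                   key=lambda n: distance(n, root), reverse=True)
    link_traversals = 0
    for node in order:
        hop = next_hop(node, root)
        partial[hop] = op(partial[hop], partial[node])
        link_traversals += 1  # one merged message per link of the tree
    return partial[root], link_traversals


def make_mesh(width: int, height: int) -> Dict[Coord, int]:
    """Give each core of a width x height mesh a distinct integer value."""
    return {(x, y): x + width * y for x in range(width) for y in range(height)}


if __name__ == "__main__":
    mesh = make_mesh(4, 4)
    print(naive_reduce(mesh, (0, 0), operator.add))
    print(in_router_reduce(mesh, (0, 0), operator.add))
```

Both functions return the same reduced value. In this model the in-router variant sends exactly one message over each link of the routing tree, which is the message-coalescing effect the abstract describes, and the root core performs no combine steps itself.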

CLC Number: