
Computer Engineering ›› 2010, Vol. 36 ›› Issue (11): 217-220. doi: 10.3969/j.issn.1000-3428.2010.11.079

• Networks and Communications •

Design and Implementation of Memory System for Tiled Stream Processor

WANG Fang1, AN Hong1,2, XU Guang1, XU Mu1, YAO Ping1   

  1. School of Computer Science and Technology, University of Science and Technology of China, Hefei 230027; 2. Key Laboratory of Computer System and Architecture, Chinese Academy of Sciences, Beijing 100080
  • Online:2010-06-05 Published:2010-06-05

分片式流处理器上存储系统的设计与实现

汪 芳1,安 虹1,2,徐 光1,许 牧1,姚 平1   

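  • About the authors: WANG Fang (b. 1984), female, master's student, main research interest: high-performance processor architecture; AN Hong, associate professor; XU Guang, doctoral student; XU Mu and YAO Ping, master's students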
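  • Supported by:
    Key Program of the National Natural Science Foundation of China (60633040, 60736012); National "973" Program of China (2005CB321601); National "863" Program of China (2006AA01A102, 2009AA01Z106); Ministry of Education-Intel Special Research Fund for Information Technology (MOE-INTEL-08-07)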

Abstract: To address the "memory wall" problem by improving the utilization of off-chip bandwidth, this paper designs a Data-Parallel Memory System (DPMS) for the tiled stream processor developed in the authors' current project. The memory system reduces the time cost of off-chip memory accesses and lowers the demand on off-chip bandwidth. Results of software simulation and emulation verification indicate that, for different workloads and with suitably optimized configuration parameters, the design can fully capture the row locality and bank parallelism of memory accesses and thereby further improve the utilization of DRAM bandwidth.

Key words: tiled stream processor, Data-Parallel Memory System (DPMS), off-chip bandwidth

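Abstract (translated from the Chinese): To address the "memory wall" problem from the standpoint of improving off-chip bandwidth utilization, a data-parallel memory system is designed and implemented for a tiled stream processor. Through multi-level scheduling, the memory system effectively reduces the number of off-chip memory accesses and lowers the demand for off-chip bandwidth. Results of software simulation and emulation verification show that, under different workload characteristics and with optimized choices of design parameters, the design can fully exploit the row locality and bank parallelism of memory accesses, thereby improving bandwidth utilization.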

Key words (translated from the Chinese): tiled stream processing, data-parallel memory system, off-chip bandwidth

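The abstract refers to row locality and bank parallelism without room to illustrate them. The following is a minimal sketch of the general idea only, not the DPMS design described in the paper: a toy DRAM model in which an access is a row hit when its row is already open in its bank. The geometry (`NUM_BANKS`, `ROW_SIZE`), the address mapping in `decode`, and all function names are hypothetical and chosen purely for illustration.

```python
from collections import defaultdict

# Hypothetical geometry, chosen only for illustration (not from the paper).
NUM_BANKS = 4
ROW_SIZE = 1024  # bytes per DRAM row


def decode(addr):
    """Map a byte address to (bank, row); consecutive rows interleave across banks."""
    row_index = addr // ROW_SIZE
    return row_index % NUM_BANKS, row_index // NUM_BANKS


def count_row_hits(addresses):
    """Count accesses whose row is already open in the target bank's row buffer."""
    open_row = {}  # bank -> currently open row
    hits = 0
    for addr in addresses:
        bank, row = decode(addr)
        if open_row.get(bank) == row:
            hits += 1
        open_row[bank] = row
    return hits


def reorder_by_row(addresses):
    """Group pending requests by (bank, row), keeping groups in first-seen order."""
    groups = defaultdict(list)
    for addr in addresses:
        groups[decode(addr)].append(addr)
    return [addr for group in groups.values() for addr in group]


def banks_touched(addresses):
    """Number of distinct banks a request stream can keep busy."""
    return len({decode(addr)[0] for addr in addresses})


if __name__ == "__main__":
    # Requests that alternate between two rows of the same bank.
    reqs = [0, 4096, 64, 4160, 128, 4224]
    print(count_row_hits(reqs))                  # in-order servicing
    print(count_row_hits(reorder_by_row(reqs)))  # after grouping by row
    print(banks_touched([0, 1024, 2048, 3072]))  # stream spread over banks
```

In this toy model, servicing the alternating stream in arrival order never reuses an open row, whereas grouping requests by row lets later accesses in each group hit the row buffer. A stream spread across banks gives a scheduler independent banks to overlap, which is the sense in which bank parallelism is used here.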