
Computer Engineering ›› 2013, Vol. 39 ›› Issue (4): 287-290, 295. doi: 10.3969/j.issn.1000-3428.2013.04.066

• Development Research and Engineering Application •

Design of Storage Optimization Based on FPGA Linear Equations System

PENG Yu, ZHONG Xue-jie, WANG Shao-jun

  1. (School of Electrical Engineering and Automation, Harbin Institute of Technology, Harbin 150080, China)
  • Received: 2012-05-09 Published: 2013-04-15 Released: 2013-04-12
  • About the authors: PENG Yu (b. 1973), male, professor and doctoral supervisor, main research interest: reconfigurable computing; ZHONG Xue-jie, master's student; WANG Shao-jun, doctoral student
  • Supported by:
    Program for New Century Excellent Talents in University, Ministry of Education (NCET-10-0062); Specialized Research Fund for the Doctoral Program of Higher Education, Ministry of Education (20092302110013)

Abstract: When a modified Cholesky decomposition implemented on a Field Programmable Gate Array (FPGA) is applied to solving large-scale systems of linear equations, limited storage resources and memory bandwidth become bottlenecks. To address this, this paper proposes a solution based on a hierarchical storage strategy and a multi-port, block-wise access method. On-chip Block Random Access Memory (BRAM) is combined with off-chip Synchronous Dynamic Random Access Memory (SDRAM) to form a hierarchical storage structure, and on-chip storage is reused to reduce the demand for storage resources. The off-chip SDRAM is accessed through multiple ports in block mode, which raises the bandwidth and avoids the access latency of random data accesses. Test results show that, relative to a Xeon CPU, the design achieves an efficiency improvement of 17 to 215 times.

Key words: Field Programmable Gate Array (FPGA), system of linear equations, matrix, modified Cholesky decomposition, bandwidth
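
For readers unfamiliar with the algorithm named in the abstract, the following is a minimal software sketch of solving a symmetric positive-definite system with a square-root-free (LDL^T) factorization. It assumes that "modified Cholesky decomposition" refers to this LDL^T variant, which the abstract does not spell out. The function names `ldlt_decompose` and `ldlt_solve` are hypothetical. The sketch illustrates only the arithmetic; it does not model the paper's FPGA architecture, its BRAM/SDRAM hierarchy, or its block access scheme.

```python
def ldlt_decompose(a):
    """Factor a symmetric positive-definite matrix as A = L * D * L^T.

    Assumption: this square-root-free form is what "modified Cholesky" means here.
    Returns (l, d), where l is unit lower triangular (list of rows)
    and d holds the diagonal entries of D.
    """
    n = len(a)
    l = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    d = [0.0] * n
    for j in range(n):
        # Diagonal entry: subtract contributions of earlier columns.
        d[j] = a[j][j] - sum(l[j][k] ** 2 * d[k] for k in range(j))
        # Entries below the diagonal in column j.
        for i in range(j + 1, n):
            s = sum(l[i][k] * l[j][k] * d[k] for k in range(j))
            l[i][j] = (a[i][j] - s) / d[j]
    return l, d


def ldlt_solve(a, b):
    """Solve A x = b using the factorization A = L * D * L^T."""
    n = len(a)
    l, d = ldlt_decompose(a)
    # Forward substitution: L y = b.
    y = [0.0] * n
    for i in range(n):
        y[i] = b[i] - sum(l[i][k] * y[k] for k in range(i))
    # Diagonal scaling: D z = y.
    z = [y[i] / d[i] for i in range(n)]
    # Back substitution: L^T x = z.
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = z[i] - sum(l[k][i] * x[k] for k in range(i + 1, n))
    return x


if __name__ == "__main__":
    A = [[4.0, 2.0, 2.0], [2.0, 5.0, 3.0], [2.0, 3.0, 6.0]]
    b = [14.0, 21.0, 26.0]
    print(ldlt_solve(A, b))
```

Avoiding square roots is one plausible reason such a variant would suit hardware, since the factorization then needs only multiply, subtract, and divide operations; the abstract itself does not state the authors' motivation.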
