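Abstract: When the modified Cholesky decomposition is implemented on a Field Programmable Gate Array (FPGA) to solve large-scale systems of linear equations, limited storage resources and memory bandwidth become bottlenecks. To address this, a solution based on a hierarchical storage strategy and multi-port block access is proposed. On-chip Block Random Access Memory (BRAM) is combined with off-chip Synchronous Dynamic Random Access Memory (SDRAM) to form a hierarchical storage structure, and on-chip storage is reused to reduce the storage resource requirement. The off-chip SDRAM is accessed through multiple ports in blocks, which raises bandwidth and avoids the access latency of random data accesses. Test results show that, relative to a Xeon CPU, the scheme achieves a 17-fold to 215-fold efficiency improvement.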
Keywords:
field programmable gate array (FPGA),
system of linear equations,
matrix,
modified Cholesky decomposition,
bandwidth
Abstract: When the modified Cholesky decomposition is used to solve large-scale systems of linear equations on a Field Programmable Gate Array (FPGA), limited storage capacity and access bandwidth become bottlenecks. This paper proposes a solution based on a hierarchical storage strategy and a multi-port block access method. The hierarchical storage structure combines internal Block Random Access Memory (BRAM) with external Synchronous Dynamic Random Access Memory (SDRAM). The internal BRAMs are reused to reduce the storage requirement. Accessing external storage through multiple ports and in blocks improves effective bandwidth utilization and avoids random-access delay. Experimental results show that the design achieves a 17x to 215x speedup compared with a Xeon CPU.
Key words:
Field Programmable Gate Array (FPGA),
system of linear equations,
matrix,
modified Cholesky decomposition,
bandwidth
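As a point of reference for the algorithm named in the abstract, the following is a minimal software sketch of solving a symmetric positive-definite system with a square-root-free factorization A = L D L^T. It assumes that "modified Cholesky decomposition" refers to this LDL^T variant, which the abstract does not spell out. The function names `ldlt_decompose` and `ldlt_solve` are hypothetical, and the sketch does not model the FPGA storage hierarchy or the block access scheme.

```python
def ldlt_decompose(a):
    """Factor a symmetric positive-definite matrix (list of lists) as L * D * L^T.

    Returns (l, d): l is unit lower triangular, d holds the diagonal of D.
    Assumption: this square-root-free form is what "modified Cholesky" means here.
    """
    n = len(a)
    l = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    d = [0.0] * n
    for j in range(n):
        # Diagonal entry: d[j] = a[j][j] - sum_k l[j][k]^2 * d[k]
        d[j] = a[j][j] - sum(l[j][k] ** 2 * d[k] for k in range(j))
        for i in range(j + 1, n):
            # Column j of L below the diagonal
            s = sum(l[i][k] * l[j][k] * d[k] for k in range(j))
            l[i][j] = (a[i][j] - s) / d[j]
    return l, d


def ldlt_solve(a, b):
    """Solve a x = b using the L D L^T factors of a."""
    l, d = ldlt_decompose(a)
    n = len(b)
    # Forward substitution: L y = b
    y = [0.0] * n
    for i in range(n):
        y[i] = b[i] - sum(l[i][k] * y[k] for k in range(i))
    # Diagonal scaling: D z = y
    z = [y[i] / d[i] for i in range(n)]
    # Backward substitution: L^T x = z
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = z[i] - sum(l[k][i] * x[k] for k in range(i + 1, n))
    return x
```

For example, `ldlt_solve([[4.0, 2.0, 2.0], [2.0, 5.0, 3.0], [2.0, 3.0, 6.0]], [14.0, 21.0, 26.0])` recovers the solution vector (1, 2, 3) up to rounding. Each column of L depends on all previously computed columns, which is why access bandwidth to the stored factors matters once the matrix no longer fits in on-chip memory.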
PENG Yu, ZHONG Xue-Jie, WANG Shao-Jun. Design of Storage Optimization Based on FPGA Linear Equations System[J]. Computer Engineering, 2013, 39(4): 287-290,295.