
Computer Engineering, 2021, Vol. 47, Issue 4: 153-157. doi: 10.19678/j.issn.1000-3428.0057835

• Architecture and Software Technology •

Design of SoC System for Convolution Acceleration Based on RISC-V Processor

ZHANG Kunning1, ZHAO Shuo1, HE Hu1, DENG Ning1, YANG Xu2

  1. Institute of Microelectronics, Tsinghua University, Beijing 100084, China;
  2. School of Software, Beijing Institute of Technology, Beijing 100081, China
  • Received: 2020-03-23 Revised: 2020-05-13 Published: 2020-04-24
  • About the authors: ZHANG Kunning (born 1995), female, M.S., main research interest: convolution accelerator design and optimization; ZHAO Shuo, M.S.; HE Hu (corresponding author), associate professor, Ph.D.; DENG Ning, professor, Ph.D.; YANG Xu, associate professor, Ph.D.
  • Funding:
    National Natural Science Foundation of China (91846303).

Abstract: To improve the computational and energy efficiency of Convolutional Neural Networks (CNNs), this paper proposes a convolution accelerator that takes 8-bit fixed-point data as input. The accelerator supports the computation types common in CNNs, including activation, Batch Normalization (BN), and pooling. Optimizing the loop computation order and combining it with data reuse improves the efficiency of convolution computation. Following a software-hardware co-design approach, an SoC system comprising a RISC-V processor and the convolution accelerator is built. Because the RISC-V processor is based on an open-source instruction set standard, its instructions can be extended to suit specific design requirements. The SoC system is deployed on a Xilinx ZCU102 development board, with the RISC-V processor running at 100 MHz and the accelerator at 300 MHz. Test results show that the accelerator reaches a computing throughput of 153.6 GOP/s and delivers a good speedup when running image inference with the VGG16 network.

Key words: convolution acceleration, loop computation optimization, data reuse, RISC-V processor, SoC system, co-design of software and hardware
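
The loop reordering and data reuse mentioned in the abstract can be illustrated with a small software model. The sketch below is not the paper's hardware design: the loop order, the function names (`conv2d_int8`, `relu_maxpool2x2`, `macs_per_cycle`), the stride-1/no-padding setting, and the 2x2 pooling window are all assumptions chosen for illustration. It places the kernel position in the outermost loops, so each weight slice is fetched once and reused across every output pixel, and it accumulates 8-bit products in 32-bit integers. The last helper does simple arithmetic on the reported 153.6 GOP/s and 300 MHz figures, assuming one multiply-accumulate counts as two operations.

```python
import numpy as np


def conv2d_int8(ifmap, weights):
    """Direct convolution on int8 data with int32 accumulation.

    ifmap:   (C_in, H, W) int8 input feature map
    weights: (C_out, C_in, K, K) int8 kernels
    returns: (C_out, H-K+1, W-K+1) int32 (stride 1, no padding; assumed)
    """
    c_out, c_in, k, _ = weights.shape
    _, h, w = ifmap.shape
    oh, ow = h - k + 1, w - k + 1
    out = np.zeros((c_out, oh, ow), dtype=np.int32)
    # Kernel position is the outermost loop pair: the weight slice for
    # (ky, kx) is loaded once and reused for all output pixels, and the
    # shifted input window is reused for all output channels.
    for ky in range(k):
        for kx in range(k):
            window = ifmap[:, ky:ky + oh, kx:kx + ow].astype(np.int32)
            w_slice = weights[:, :, ky, kx].astype(np.int32)  # (C_out, C_in)
            out += np.einsum("oc,chw->ohw", w_slice, window)
    return out


def relu_maxpool2x2(x):
    """ReLU activation followed by 2x2 max pooling with stride 2."""
    x = np.maximum(x, 0)
    c, h, w = x.shape
    x = x[:, :h - h % 2, :w - w % 2]  # drop an odd trailing row/column
    return x.reshape(c, h // 2, 2, w // 2, 2).max(axis=(2, 4))


def macs_per_cycle(gops, freq_mhz, ops_per_mac=2):
    """Multiply-accumulates per clock cycle implied by a throughput figure.

    Assumes the figure counts each MAC as `ops_per_mac` operations.
    """
    return gops * 1e9 / (freq_mhz * 1e6) / ops_per_mac


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x = rng.integers(-128, 128, size=(3, 6, 6), dtype=np.int8)
    w = rng.integers(-128, 128, size=(4, 3, 3, 3), dtype=np.int8)
    y = relu_maxpool2x2(conv2d_int8(x, w))
    print(y.shape)
    print(macs_per_cycle(153.6, 300))
```

Under the two-operations-per-MAC assumption, the reported throughput corresponds to roughly 256 multiply-accumulates per accelerator clock cycle. This is only a consistency check on the published numbers, not a statement about the actual size of the accelerator's compute array.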

CLC Number: