[1] |
JACOB B,KLIGYS S,CHEN Bo,et al.Quantization and training of neural networks for efficient integer-arithmetic-only inference[EB/OL].[2018-07-20].https://arxiv.org/pdf/1712.05877.pdf.
|
[2] |
DETTMERS T.8-bit approximations for parallelism in deep learning[EB/OL].[2018-07-20].https://arxiv.org/pdf/1511.04561.pdf.
|
[3] |
GYSEL P,MOTAMEDI M,GHIASI S.Hardware-oriented approximation of convolutional neural networks[EB/OL].[2018-07-20].https://arxiv.org/pdf/1604.03168.pdf.
|
[4] |
HAN Song,MAO Huizi,DALLY W J.Deep compression:compressing deep neural networks with pruning,trained quantization and Huffman coding[EB/OL].[2018-07-20].https://arxiv.org/pdf/1510.00149.pdf.
|
[5] |
NARANG S,DIAMOS G.An update to DeepBench with a focus on deep learning inference[EB/OL].[2018-07-20].https://svail.github.io/DeepBench-update.
|
[6] |
JANG J,CHOI S,PRASANNA V K K.Area and time efficient implementations of matrix multiplication on FPGAs[C]//Proceedings of IEEE International Conference on Field-programmable Technology.Washington D.C.,USA:IEEE Press,2002:93-100.
|
[7] |
CAMPBELL S J,KHATRI S P.Resource and delay efficient matrix multiplication using newer FPGA devices[C]//Proceedings of the 16th ACM Great Lakes Symposium on VLSI.New York,USA:ACM Press,2006:308-311.
|
[8] |
EL-ATFY R,DESSOUKY M A,EL-GHITANI H.Accelerating matrix multiplication on FPGAs[C]//Proceedings of the 2nd International Design and Test Workshop.Washington D.C.,USA:IEEE Press,2007:203-204.
|
[9] |
DAVE N,FLEMING K,KING M,et al.Hardware acceleration of matrix multiplication on a Xilinx FPGA[C]//Proceedings of IEEE/ACM International Conference on Formal Methods and Models for Codesign.Washington D.C.,USA:IEEE Press,2007:97-100.
|
[10] |
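TIAN Xiang,ZHOU Fan,CHEN Yaowu,et al.Design of an FPGA-based real-time double-precision floating-point matrix multiplier[J].Journal of Zhejiang University(Engineering Science),2008,42(9):1611-1615.(in Chinese)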
|
[11] |
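ZHANG Ting.Research on key technologies for FPGA acceleration of floating-point matrix multiplication in embedded environments[D].Changsha:Hunan University,2013.(in Chinese)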
|
[12] |
MA Yechen,LI Xingfei.Design of a matrix operation hardware accelerator for navigation solution[J].Computer Engineering,2014,40(8):259-263.(in Chinese)
|
[13] |
Intel Corporation.Intel Xeon Phi delivers competitive performance for deep learning and getting better fast[EB/OL].[2018-07-20].https://software.intel.com/en-us/articles/intel-xeon-phi-delivers-competitive-performance-for-deep-learningand-getting-better-fast.
|
[14] |
MOSS D J M,KRISHNAN S,NURVITADHI E,et al.A customizable matrix multiplication framework for the Intel HARPv2 Xeon+ FPGA platform:a deep learning case study[C]//Proceedings of 2018 ACM/SIGDA International Symposium on Field-Programmable Gate Arrays.New York,USA:ACM Press,2018:107-116.
|
[15] |
KRIZHEVSKY A,SUTSKEVER I,HINTON G E.ImageNet classification with deep convolutional neural networks[C]//Proceedings of the 25th International Conference on Neural Information Processing Systems.New York,USA:ACM Press,2012:1097-1105.
|
[16] |
Xillybus.Xillybus host application programming guide for Linux[EB/OL].[2018-07-20].http://xillybus.com/downloads/doc/xillybus_host_programming_guide_linux.pdf.
|