[1] SIMONYAN K, ZISSERMAN A. Two-stream convolutional networks for action recognition in videos[C]//Proceedings of the 27th International Conference on Neural Information Processing Systems. New York, USA: ACM Press, 2014: 568-576.
[2] TRAN D, BOURDEV L, FERGUS R, et al. Learning spatiotemporal features with 3D convolutional networks[C]//Proceedings of 2015 IEEE International Conference on Computer Vision. Washington D. C., USA: IEEE Press, 2015: 4489-4497.
[3] CARREIRA J, ZISSERMAN A. Quo vadis, action recognition? A new model and the Kinetics dataset[C]//Proceedings of 2017 IEEE Conference on Computer Vision and Pattern Recognition. Washington D. C., USA: IEEE Press, 2017: 6299-6308.
[4] ZHOU S C, NI Z K, ZHOU X Y, et al. DoReFa-Net: training low bitwidth convolutional neural networks with low bitwidth gradients[EB/OL]. [2022-09-20]. https://arxiv.org/abs/1606.06160.
[5] JACOB B, KLIGYS S, CHEN B, et al. Quantization and training of neural networks for efficient integer-arithmetic-only inference[C]//Proceedings of 2018 IEEE Conference on Computer Vision and Pattern Recognition. Washington D. C., USA: IEEE Press, 2018: 2704-2713.
[6]
[7] LEE J, KIM D, HAM B. Network quantization with element-wise gradient scaling[C]//Proceedings of 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Washington D. C., USA: IEEE Press, 2021: 6444-6453.
[8] ZHANG S J, DU Z D, ZHANG L, et al. Cambricon-X: an accelerator for sparse neural networks[C]//Proceedings of the 49th Annual IEEE/ACM International Symposium on Microarchitecture. Washington D. C., USA: IEEE Press, 2016: 1-12.
[9] CHEN Y H, KRISHNA T, EMER J S, et al. Eyeriss: an energy-efficient reconfigurable accelerator for deep convolutional neural networks[J]. IEEE Journal of Solid-State Circuits, 2017, 52(1): 127-138. doi: 10.1109/JSSC.2016.2616357
[10] PARASHAR A, RHU M, MUKKARA A, et al. SCNN: an accelerator for compressed-sparse convolutional neural networks[J]. ACM SIGARCH Computer Architecture News, 2017, 45(2): 27-40. doi: 10.1145/3140659.3080254
[11] RIERA M, ARNAU J M, GONZALEZ A. Computation reuse in DNNs by exploiting input similarity[C]//Proceedings of the 45th ACM/IEEE Annual International Symposium on Computer Architecture. Washington D. C., USA: IEEE Press, 2018: 57-68.
[12] YUAN Z, YANG Y X, YUE J S, et al. 14.2 A 65 nm 24.7 µJ/frame 12.3 mW activation-similarity-aware convolutional neural network video processor using hybrid precision, inter-frame data reuse and mixed-bit-width difference-frame data codec[C]//Proceedings of 2020 IEEE International Solid-State Circuits Conference. Washington D. C., USA: IEEE Press, 2020: 232-234.
[13] LI S Z, WANG Q, JIANG J F, et al. An efficient CNN accelerator using inter-frame data reuse of videos on FPGAs[J]. IEEE Transactions on Very Large Scale Integration Systems, 2022, 30(11): 1587-1600. doi: 10.1109/TVLSI.2022.3151788
[14] LI S C, WEN W, WANG Y, et al. An FPGA design framework for CNN sparsification and acceleration[C]//Proceedings of the 25th IEEE Annual International Symposium on Field-Programmable Custom Computing Machines. Washington D. C., USA: IEEE Press, 2017: 28.
[15] MENG J A, VENKATARAMANAIAH S K, ZHOU C T, et al. FixyFPGA: efficient FPGA accelerator for deep neural networks with high element-wise sparsity and without external memory access[C]//Proceedings of the 31st International Conference on Field-Programmable Logic and Applications. Washington D. C., USA: IEEE Press, 2021: 9-16.
[16] DI X K, YANG H G. FPGA-based accelerator for sparse convolutional neural network[J]. Computer Engineering, 2021, 47(7): 189-195, 204.
[17] LU L Q, LIANG Y. SpWA: an efficient sparse Winograd convolutional neural networks accelerator on FPGAs[C]//Proceedings of the 55th ACM/ESDA/IEEE Design Automation Conference. Washington D. C., USA: IEEE Press, 2018: 1-6.
[18] WANG X A, WANG C, CAO J, et al. WinoNN: optimizing FPGA-based convolutional neural network accelerators using sparse Winograd algorithm[J]. IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, 2020, 39(11): 4290-4302. doi: 10.1109/TCAD.2020.3012323
[19] YANG T, HE Z Z, KOU T C, et al. BISWSRBS: a Winograd-based CNN accelerator with a fine-grained regular sparsity pattern and mixed precision quantization[J]. ACM Transactions on Reconfigurable Technology and Systems, 2021, 14(4): 1-28.
[20] LAVIN A, GRAY S. Fast algorithms for convolutional neural networks[C]//Proceedings of 2016 IEEE Conference on Computer Vision and Pattern Recognition. Washington D. C., USA: IEEE Press, 2016: 4013-4021.
[21] IOFFE S, SZEGEDY C. Batch normalization: accelerating deep network training by reducing internal covariate shift[C]//Proceedings of the 32nd International Conference on Machine Learning. Washington D. C., USA: IEEE Press, 2015: 448-456.
[22] CHEN Y H, EMER J, SZE V. Eyeriss: a spatial architecture for energy-efficient dataflow for convolutional neural networks[C]//Proceedings of the 43rd ACM/IEEE Annual International Symposium on Computer Architecture. Washington D. C., USA: IEEE Press, 2016: 367-379.
[23] AHMAD A, PASHA M A, RAZA G J. Accelerating tiny YOLOv3 using FPGA-based hardware/software co-design[C]//Proceedings of 2020 IEEE International Symposium on Circuits and Systems. Washington D. C., USA: IEEE Press, 2020: 1-5.
[24] HUANG J M, YANG J Y, NUI S, et al. A low-bit quantized and HLS-based neural network FPGA accelerator for object detection[C]//Proceedings of 2021 China Semiconductor Technology International Conference. Washington D. C., USA: IEEE Press, 2021: 11-23.
[25] PESTANA D, MIRANDA P R, LOPES J D, et al. A full featured configurable accelerator for object detection with YOLO[J]. IEEE Access, 2021, 9: 75864-75877. doi: 10.1109/ACCESS.2021.3081818