
Computer Engineering ›› 2022, Vol. 48 ›› Issue (9): 113-120. doi: 10.19678/j.issn.1000-3428.0062417

• Artificial Intelligence and Pattern Recognition •

Structured Pruning Algorithm with Adaptive Threshold Based on Gradient

WANG Guodong1, YE Jian1,2, XIE Ying1,2, QIAN Yueliang1,2   

  1. Linyi Zhongke Artificial Intelligence Innovation Research Institute, Linyi, Shandong 276000, China;
    2. Research Center for Ubiquitous Computing Systems, Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100190, China
  • Received: 2021-08-19 Revised: 2021-10-27 Published: 2021-11-02

  • About the authors: WANG Guodong (b. 1991), male, M.S.; his research interests include object detection and model compression. YE Jian (corresponding author), senior engineer, Ph.D.; XIE Ying, senior engineer, M.S.; QIAN Yueliang, professorate senior engineer.
  • Funding:
    National Key Research and Development Program of China (2017YFB1302400); Major Science and Technology Innovation Project of Shandong Province (2019JZZY020102); Jiangsu Province Science and Technology Plan, Industry Prospect and Common Key Technology Competition Project (BE2018084).

Abstract: To run Deep Neural Network (DNN) models on edge devices and perform real-time analysis, the network model must be compressed to reduce its parameter count and computational cost. However, most existing pruning algorithms are time-consuming and achieve low compression rates. This study proposes a gradient-based structured pruning algorithm that sets a different threshold for each layer so as to remove redundant network parameters to the greatest possible extent. Neurons are evaluated at a fine granularity: the gradient information of the network is used to measure the importance of each weight, and the pruning weight thresholds of the different network layers are obtained through grid search and curvature calculation. The redundant parameters to be removed from the convolution kernels of each layer are determined from the parameters remaining after the search. Next, the number of effective parameters in each convolution kernel is counted, and the kernels with more effective parameters are retained, thereby adjusting the number of convolution kernels in each layer. After this adjustment, the network is retrained to preserve model accuracy. Pruning experiments are conducted on the VGG16 and ResNet50 classification models and on the SSD, Yolov4, and MaskRCNN object detection models. The results show that, after pruning with the proposed algorithm, the parameter counts of the classification models decrease by more than 92% and their computation by more than 70%, while the parameter counts of the object detection models decrease by more than 75% and their computation by more than 57%. The pruning effect is better than that of algorithms such as Rethinking and PF.
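As a rough illustration only (not the authors' implementation), the per-layer steps described in the abstract — gradient-based importance, a grid-searched threshold chosen by curvature, and kernel selection by effective-parameter count — could be sketched in NumPy as follows. The |w·g| saliency proxy, the linear threshold grid, and all function names are assumptions for this sketch.

```python
import numpy as np

def saliency(w, g):
    """Parameter importance as |weight * gradient| -- a first-order
    proxy for how much the loss changes if the weight is removed.
    (Illustrative stand-in for the paper's gradient-based measure.)"""
    return np.abs(w * g)

def layer_threshold(scores, num_candidates=20):
    """Grid-search candidate thresholds for one layer and pick the one
    at the point of maximum curvature of the remaining-parameter curve
    (a rough analogue of the paper's grid search + curvature step)."""
    s = scores.ravel()
    thresholds = np.linspace(0.0, s.max(), num_candidates)
    remaining = np.array([(s > t).sum() for t in thresholds], dtype=float)
    curvature = np.abs(np.diff(remaining, 2))  # discrete 2nd difference
    return thresholds[1 + int(np.argmax(curvature))]

def select_kernels(w, g, keep):
    """Keep the `keep` convolution kernels (output channels) that have
    the most effective (above-threshold) parameters."""
    imp = saliency(w, g)                        # (out_ch, in_ch, k, k)
    t = layer_threshold(imp)
    effective = (imp > t).reshape(imp.shape[0], -1).sum(axis=1)
    kept = np.sort(np.argsort(effective)[::-1][:keep])
    return kept, effective

# Toy layer: 4 output kernels of shape (2, 3, 3); kernel 0 is strong,
# kernel 3 is nearly dead, so pruning should drop kernel 3.
w = np.ones((4, 2, 3, 3))
w[0] *= 5.0
w[3] *= 0.01
g = np.ones_like(w)                             # stand-in gradients
kept, counts = select_kernels(w, g, keep=3)
print(kept)    # -> [0 1 2]: kernel 3 is pruned
print(counts)  # effective-parameter count per kernel: 18, 18, 18, 0
```

In a full pipeline this selection would be applied per layer, the pruned network rebuilt with the adjusted kernel counts, and the model retrained to recover accuracy, as the abstract describes.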

Key words: model compression, neural network, gradient information, adaptive threshold, structured pruning

