
Computer Engineering ›› 2022, Vol. 48 ›› Issue (9): 113-120. doi: 10.19678/j.issn.1000-3428.0062417

• Artificial Intelligence and Pattern Recognition •

Structured Pruning Algorithm with Adaptive Threshold Based on Gradient

WANG Guodong1, YE Jian1,2, XIE Ying1,2, QIAN Yueliang1,2   

  1. Linyi Zhongke Artificial Intelligence Innovation Research Institute, Linyi, Shandong 276000, China;
    2. Research Center for Ubiquitous Computing Systems, Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100190, China
  • Received: 2021-08-19 Revised: 2021-10-27 Published: 2021-11-02

  • About the authors: WANG Guodong (b. 1991), male, M.S.; his research interests include object detection and model compression. YE Jian (corresponding author), senior engineer, Ph.D.; XIE Ying, senior engineer, M.S.; QIAN Yueliang, professorate senior engineer.
  • Funding:
    National Key Research and Development Program of China (2017YFB1302400); Major Science and Technology Innovation Project of Shandong Province (2019JZZY020102); Jiangsu Province Science and Technology Plan, Industry Prospect and Common Key Technology Competition Project (BE2018084).

Abstract: To run Deep Neural Network (DNN) models on edge devices and perform real-time analysis, the network model must be compressed to reduce its parameter count and computational cost. However, most existing pruning algorithms are time-consuming and achieve low compression rates. This study proposes a gradient-based structured pruning algorithm that sets a different threshold for each layer so as to remove redundant network parameters to the greatest possible extent. Neurons are evaluated at a fine granularity: the gradient information of the network is used to measure the importance of each weight, and the pruning weight thresholds of the different network layers are obtained through grid search and curvature calculation. The redundant parameters to be removed from the convolution kernels of each layer are determined from the parameters remaining after the search. Next, the number of effective parameters in each convolution kernel is counted, and the kernels with more effective parameters are retained, thereby adjusting the number of convolution kernels in each layer. After this adjustment, the network is retrained to preserve model accuracy. Pruning experiments are conducted on the VGG16 and ResNet50 classification models and on the SSD, Yolov4, and MaskRCNN object detection models. The results show that, after pruning with the proposed algorithm, the parameter counts of the classification models decrease by more than 92% and their computation by more than 70%, while the parameter counts of the object detection models decrease by more than 75% and their computation by more than 57%. The pruning effect is better than that of algorithms such as Rethinking and PF.
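As a rough illustration only (not the authors' implementation), the per-layer steps described in the abstract — gradient-based importance, a grid-searched threshold chosen by curvature, and kernel selection by effective-parameter count — could be sketched in NumPy as follows. The |w·g| saliency proxy, the linear threshold grid, and all function names are assumptions for this sketch.

```python
import numpy as np

def saliency(w, g):
    """Parameter importance as |weight * gradient| -- a first-order
    proxy for how much the loss changes if the weight is removed.
    (Illustrative stand-in for the paper's gradient-based measure.)"""
    return np.abs(w * g)

def layer_threshold(scores, num_candidates=20):
    """Grid-search candidate thresholds for one layer and pick the one
    at the point of maximum curvature of the remaining-parameter curve
    (a rough analogue of the paper's grid search + curvature step)."""
    s = scores.ravel()
    thresholds = np.linspace(0.0, s.max(), num_candidates)
    remaining = np.array([(s > t).sum() for t in thresholds], dtype=float)
    curvature = np.abs(np.diff(remaining, 2))  # discrete 2nd difference
    return thresholds[1 + int(np.argmax(curvature))]

def select_kernels(w, g, keep):
    """Keep the `keep` convolution kernels (output channels) that have
    the most effective (above-threshold) parameters."""
    imp = saliency(w, g)                        # (out_ch, in_ch, k, k)
    t = layer_threshold(imp)
    effective = (imp > t).reshape(imp.shape[0], -1).sum(axis=1)
    kept = np.sort(np.argsort(effective)[::-1][:keep])
    return kept, effective

# Toy layer: 4 output kernels of shape (2, 3, 3); kernel 0 is strong,
# kernel 3 is nearly dead, so pruning should drop kernel 3.
w = np.ones((4, 2, 3, 3))
w[0] *= 5.0
w[3] *= 0.01
g = np.ones_like(w)                             # stand-in gradients
kept, counts = select_kernels(w, g, keep=3)
print(kept)    # -> [0 1 2]: kernel 3 is pruned
print(counts)  # effective-parameter count per kernel: 18, 18, 18, 0
```

In a full pipeline this selection would be applied per layer, the pruned network rebuilt with the adjusted kernel counts, and the model retrained to recover accuracy, as the abstract describes.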

Key words: model compression, neural network, gradient information, adaptive threshold, structured pruning

