
Computer Engineering ›› 2021, Vol. 47 ›› Issue (5): 80-87. doi: 10.19678/j.issn.1000-3428.0057842

• Artificial Intelligence and Pattern Recognition •

Neural Network Compression Method Combining Half-Wave Gaussian Quantization and Alternate Update

ZHANG Hongmei, YAN Haibing, ZHANG Xiangli

  1. Guangxi Colleges and Universities Key Laboratory of Cloud Computing and Complex Systems, Guilin University of Electronic Technology, Guilin, Guangxi 541004, China
  • Received: 2020-03-24  Revised: 2020-04-26  Published: 2020-05-13
  • About the authors: ZHANG Hongmei (born 1970), female, professor, Ph.D., whose main research interests include network information security, embedded systems, and intelligent information processing; YAN Haibing, M.S. candidate; ZHANG Xiangli, professor, Ph.D.
  • Supported by: National Natural Science Foundation of China (61461010); Foundation of the Key Laboratory of Cognitive Radio and Information Processing, Ministry of Education (CRKL170103, CRKL170104); Foundation of the Guangxi Key Laboratory of Cryptography and Information Security (GCIS201626).

Abstract: To enable the deployment of neural network models on edge devices with limited memory and high real-time requirements, this paper proposes a hybrid compression method combining Half-Wave Gaussian Quantization (HWGQ) and alternate update. The input of the neural network model is quantized with a 2-bit uniform half-wave Gaussian quantizer, and the quantized values are fed into a binary network with a scaling factor, which is trained to obtain an initial binary model. The trained binary model is then fine-tuned layer by layer using the alternate update method to improve its test accuracy. Experimental results on the CIFAR-10 and ImageNet datasets show that the proposed method effectively reduces the memory and time overhead caused by parameter redundancy and structural redundancy. At a model compression ratio of about 30, the method improves accuracy by 0.8 and 2.0 percentage points over HWGQ-Net, and its training is about 10 times faster.
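
The abstract names two quantization ingredients: a 2-bit uniform half-wave Gaussian quantizer for layer inputs, and binary weights with a scaling factor. The NumPy sketch below illustrates both under stated assumptions: the function names, the step size 0.538 (a stand-in for roughly unit-variance Gaussian inputs), and the XNOR-Net-style scaling rule alpha = mean(|W|) are illustrative choices, not the paper's exact formulation.

    import numpy as np

    def hwgq_2bit(x, step=0.538):
        """Forward pass of a 2-bit uniform half-wave Gaussian quantizer.

        Negative pre-activations are clipped to zero (the "half-wave" part),
        and non-negative values are rounded to one of the four uniform
        levels {0, step, 2*step, 3*step}. The step size here is an assumed
        placeholder; the paper's quantizer picks it by minimizing the
        quantization error of a half-Gaussian, which is not reproduced here.
        """
        x = np.maximum(x, 0.0)                  # half-wave rectification (ReLU)
        q = np.clip(np.round(x / step), 0, 3)   # 2 bits -> level indices 0..3
        return q * step

    def binarize_with_scale(w):
        """Binarize a weight tensor with a scaling factor (XNOR-Net style).

        One common reading of "a binary network with a scaling factor":
        W is approximated as alpha * sign(W) with alpha = mean(|W|), the
        value that minimizes the L2 reconstruction error.
        """
        alpha = np.abs(w).mean()
        return alpha * np.sign(w)

    # Toy usage: quantize activations, binarize a weight matrix, multiply.
    acts = hwgq_2bit(np.random.randn(4, 8))
    w_bin = binarize_with_scale(np.random.randn(8, 3))
    out = acts @ w_bin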

Key words: Convolutional Neural Network (CNN), quantization, model compression, Half-Wave Gaussian Quantization (HWGQ), alternate update
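
For the fine-tuning stage, the PyTorch sketch below shows one plausible reading of "fine-tuned layer by layer using the alternate update method": sweep over the layers of the already-trained binary model, unfreezing one layer at a time while all others stay fixed, so each layer can adapt to the quantization error of the rest. The function name `alternate_update_finetune`, the Adam optimizer, and the freeze/unfreeze schedule are all assumptions for illustration, not the authors' code.

    import torch
    import torch.nn as nn

    def alternate_update_finetune(model, loader, loss_fn, lr=1e-4, passes=1):
        """Layer-by-layer alternate-update fine-tuning (illustrative sketch)."""
        # Collect the trainable layers in forward order.
        layers = [m for m in model.modules()
                  if isinstance(m, (nn.Conv2d, nn.Linear))]
        for _ in range(passes):
            for layer in layers:
                for p in model.parameters():     # freeze the whole model ...
                    p.requires_grad_(False)
                for p in layer.parameters():     # ... except the current layer
                    p.requires_grad_(True)
                opt = torch.optim.Adam(layer.parameters(), lr=lr)
                for x, y in loader:              # update only this layer
                    opt.zero_grad()
                    loss = loss_fn(model(x), y)
                    loss.backward()
                    opt.step()
        for p in model.parameters():             # restore trainability
            p.requires_grad_(True)

Given any trained binary `model`, a data `loader`, and a `loss_fn` such as `nn.CrossEntropyLoss()`, calling `alternate_update_finetune(model, loader, loss_fn)` performs one full layer-wise sweep; the paper's exact schedule (ordering, learning rate, number of passes) may differ.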
