
Computer Engineering ›› 2022, Vol. 48 ›› Issue (6): 257-262. doi: 10.19678/j.issn.1000-3428.0061569

• Graphics and Image Processing •

Lightweight Convolutional Neural Network for Image Target Recognition

SHI Baodai, ZHANG Qin, LI Yao, LI Yuhuan   

  1. Graduate School, Air Force Engineering University, Xi'an 710051, China
  • Received: 2021-05-08  Revised: 2021-07-19  Published: 2021-07-20
  • About the authors: SHI Baodai (b. 1996), male, M.S. candidate; his main research interest is image target recognition. ZHANG Qin, professor, Ph.D. LI Yao and LI Yuhuan, M.S. candidates.
  • Funding:
    General Program of the National Natural Science Foundation of China (61971438).

Abstract: Traditional image target recognition models typically use neural networks with complex structures and deeper layers to improve accuracy in the computer vision field. However, such models demand substantial computing power, occupy large amounts of memory, and cannot be deployed on small devices such as mobile phones. This study proposes ConcatNet, a lightweight Convolutional Neural Network (CNN) that combines a channel attention mechanism with depthwise separable convolution in parallel branches and merges the branches by feature concatenation. By enhancing the weights of effective features, ConcatNet reduces the parameter count and complexity of the model, yielding a lighter network. In the network output stage, a screen-then-shuffle scheme is adopted to improve the recognition accuracy of the model. The information of the intermediate feature map is extracted using both global average pooling and global stochastic pooling: global average pooling retains background information well, whereas global stochastic pooling selects features according to probability values and generalizes strongly, so combining the two reduces information loss. Experimental results on the CIFAR-10 and CIFAR-100 datasets show that, compared with lightweight neural networks such as MobileNetV2, ConcatNet reduces the parameter count and computational complexity by approximately 50% while maintaining comparable Top-1 and Top-5 accuracy, greatly reducing the demands on the devices that host the model.
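
To make the block design described above concrete, the following is a minimal PyTorch sketch of a ConcatNet-style unit as the abstract characterizes it: parallel depthwise-separable branches with SE-style channel attention, merged by feature concatenation, then screened by a 1×1 convolution and channel-shuffled. The branch widths, the attention reduction ratio, and all module names are illustrative assumptions, not the authors' released code.

```python
import torch
import torch.nn as nn


class ChannelAttention(nn.Module):
    """SE-style channel attention: squeeze with global average pooling,
    excite through a bottleneck, and rescale the input channels."""

    def __init__(self, channels: int, reduction: int = 4):  # reduction is a guess
        super().__init__()
        self.fc = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(channels, channels // reduction, 1),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels // reduction, channels, 1),
            nn.Sigmoid(),
        )

    def forward(self, x):
        return x * self.fc(x)


class DepthwiseSeparable(nn.Module):
    """Depthwise 3x3 convolution followed by a pointwise 1x1 convolution."""

    def __init__(self, in_ch: int, out_ch: int, stride: int = 1):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(in_ch, in_ch, 3, stride, 1, groups=in_ch, bias=False),
            nn.BatchNorm2d(in_ch),
            nn.ReLU(inplace=True),
            nn.Conv2d(in_ch, out_ch, 1, bias=False),
            nn.BatchNorm2d(out_ch),
            nn.ReLU(inplace=True),
        )

    def forward(self, x):
        return self.body(x)


def channel_shuffle(x, groups: int):
    """Interleave channels across groups (as in ShuffleNet)."""
    n, c, h, w = x.shape
    return (x.view(n, groups, c // groups, h, w)
             .transpose(1, 2).reshape(n, c, h, w))


class ConcatBlock(nn.Module):
    """Two parallel depthwise-separable branches with channel attention;
    outputs are concatenated, screened by a 1x1 convolution, then shuffled."""

    def __init__(self, in_ch: int, out_ch: int, groups: int = 2):
        super().__init__()
        self.groups = groups
        branch_ch = out_ch // 2
        self.branch1 = nn.Sequential(DepthwiseSeparable(in_ch, branch_ch),
                                     ChannelAttention(branch_ch))
        self.branch2 = nn.Sequential(DepthwiseSeparable(in_ch, branch_ch),
                                     ChannelAttention(branch_ch))
        self.screen = nn.Conv2d(2 * branch_ch, out_ch, 1, bias=False)  # "screen"

    def forward(self, x):
        y = torch.cat([self.branch1(x), self.branch2(x)], dim=1)  # feature concatenation
        return channel_shuffle(self.screen(y), self.groups)       # then shuffle


# Example on a CIFAR-sized input:
# blk = ConcatBlock(32, 64)
# y = blk(torch.randn(1, 32, 32, 32))  # -> torch.Size([1, 64, 32, 32])
```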

Key words: lightweight, channel attention, depthwise separable convolution, channel shuffle, feature concatenation
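
The dual-pooling output stage the abstract describes can be sketched in the same spirit. This is an assumed implementation of "global average plus global stochastic pooling": the softmax-based sampling, the expected-value fallback at evaluation time, and the concatenated classifier head are my guesses, not the paper's specification.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class GlobalStochasticPool(nn.Module):
    """Pick one spatial activation per channel with probability proportional
    to its softmaxed magnitude; use the expected value at eval time."""

    def forward(self, x):
        n, c, h, w = x.shape
        flat = x.reshape(n, c, h * w)
        probs = F.softmax(flat, dim=-1)
        if self.training:
            # Sample one location per (sample, channel) row by probability.
            idx = torch.multinomial(probs.reshape(n * c, h * w), 1)
            out = flat.reshape(n * c, h * w).gather(1, idx).view(n, c)
        else:
            out = (probs * flat).sum(dim=-1)  # probability-weighted mean
        return out


class DualPoolHead(nn.Module):
    """Concatenate global average and global stochastic descriptors, then
    classify; either pooling alone would discard more information."""

    def __init__(self, channels: int, num_classes: int):
        super().__init__()
        self.stochastic = GlobalStochasticPool()
        self.fc = nn.Linear(2 * channels, num_classes)

    def forward(self, x):
        avg = x.mean(dim=(2, 3))   # global average pooling: background context
        sto = self.stochastic(x)   # global stochastic pooling: generalization
        return self.fc(torch.cat([avg, sto], dim=1))


# Example:
# head = DualPoolHead(128, 10)
# logits = head(torch.randn(4, 128, 8, 8))  # -> torch.Size([4, 10])
```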

CLC number: