
Computer Engineering ›› 2022, Vol. 48 ›› Issue (6): 257-262. doi: 10.19678/j.issn.1000-3428.0061569

• Graphics and Image Processing •

Lightweight Convolutional Neural Network for Image Target Recognition

SHI Baodai, ZHANG Qin, LI Yao, LI Yuhuan   

  1. Graduate School, Air Force Engineering University, Xi'an 710051, China
  • Received: 2021-05-08  Revised: 2021-07-19  Published: 2021-07-20
  • About the authors: SHI Baodai (b. 1996), male, M.S. candidate; his main research interest is image target recognition. ZHANG Qin, professor, Ph.D. LI Yao and LI Yuhuan, M.S. candidates.
  • Funding:
    General Program of the National Natural Science Foundation of China (61971438).

Abstract: Traditional image target recognition models typically use neural networks with complex structures and deeper layers to improve accuracy in the computer vision field. However, such models demand substantial computing power, occupy large amounts of memory, and cannot be deployed on small devices such as mobile phones. This study proposes ConcatNet, a lightweight Convolutional Neural Network (CNN) that combines a channel attention mechanism with depthwise separable convolution in parallel branches and merges the branches by feature concatenation. By enhancing the weights of effective features, ConcatNet reduces the parameter count and complexity of the model, yielding a lighter network. In the network output stage, a screen-then-shuffle scheme is adopted to improve the recognition accuracy of the model. The information of the intermediate feature map is extracted using both global average pooling and global stochastic pooling: global average pooling retains background information well, whereas global stochastic pooling selects features according to probability values and generalizes strongly, so combining the two reduces information loss. Experimental results on the CIFAR-10 and CIFAR-100 datasets show that, compared with lightweight neural networks such as MobileNetV2, ConcatNet reduces the parameter count and computational complexity by approximately 50% while maintaining comparable Top-1 and Top-5 accuracy, greatly reducing the demands on the devices that host the model.
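
To make the block design described above concrete, the following is a minimal PyTorch sketch of a ConcatNet-style unit as the abstract characterizes it: parallel depthwise-separable branches with SE-style channel attention, merged by feature concatenation, then screened by a 1×1 convolution and channel-shuffled. The branch widths, the attention reduction ratio, and all module names are illustrative assumptions, not the authors' released code.

```python
import torch
import torch.nn as nn


class ChannelAttention(nn.Module):
    """SE-style channel attention: squeeze with global average pooling,
    excite through a bottleneck, and rescale the input channels."""

    def __init__(self, channels: int, reduction: int = 4):  # reduction is a guess
        super().__init__()
        self.fc = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(channels, channels // reduction, 1),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels // reduction, channels, 1),
            nn.Sigmoid(),
        )

    def forward(self, x):
        return x * self.fc(x)


class DepthwiseSeparable(nn.Module):
    """Depthwise 3x3 convolution followed by a pointwise 1x1 convolution."""

    def __init__(self, in_ch: int, out_ch: int, stride: int = 1):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(in_ch, in_ch, 3, stride, 1, groups=in_ch, bias=False),
            nn.BatchNorm2d(in_ch),
            nn.ReLU(inplace=True),
            nn.Conv2d(in_ch, out_ch, 1, bias=False),
            nn.BatchNorm2d(out_ch),
            nn.ReLU(inplace=True),
        )

    def forward(self, x):
        return self.body(x)


def channel_shuffle(x, groups: int):
    """Interleave channels across groups (as in ShuffleNet)."""
    n, c, h, w = x.shape
    return (x.view(n, groups, c // groups, h, w)
             .transpose(1, 2).reshape(n, c, h, w))


class ConcatBlock(nn.Module):
    """Two parallel depthwise-separable branches with channel attention;
    outputs are concatenated, screened by a 1x1 convolution, then shuffled."""

    def __init__(self, in_ch: int, out_ch: int, groups: int = 2):
        super().__init__()
        self.groups = groups
        branch_ch = out_ch // 2
        self.branch1 = nn.Sequential(DepthwiseSeparable(in_ch, branch_ch),
                                     ChannelAttention(branch_ch))
        self.branch2 = nn.Sequential(DepthwiseSeparable(in_ch, branch_ch),
                                     ChannelAttention(branch_ch))
        self.screen = nn.Conv2d(2 * branch_ch, out_ch, 1, bias=False)  # "screen"

    def forward(self, x):
        y = torch.cat([self.branch1(x), self.branch2(x)], dim=1)  # feature concatenation
        return channel_shuffle(self.screen(y), self.groups)       # then shuffle


# Example on a CIFAR-sized input:
# blk = ConcatBlock(32, 64)
# y = blk(torch.randn(1, 32, 32, 32))  # -> torch.Size([1, 64, 32, 32])
```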

Key words: lightweight, channel attention, depthwise separable convolution, channel shuffle, feature concatenation
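
The dual-pooling output stage the abstract describes can be sketched in the same spirit. This is an assumed implementation of "global average plus global stochastic pooling": the softmax-based sampling, the expected-value fallback at evaluation time, and the concatenated classifier head are my guesses, not the paper's specification.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class GlobalStochasticPool(nn.Module):
    """Pick one spatial activation per channel with probability proportional
    to its softmaxed magnitude; use the expected value at eval time."""

    def forward(self, x):
        n, c, h, w = x.shape
        flat = x.reshape(n, c, h * w)
        probs = F.softmax(flat, dim=-1)
        if self.training:
            # Sample one location per (sample, channel) row by probability.
            idx = torch.multinomial(probs.reshape(n * c, h * w), 1)
            out = flat.reshape(n * c, h * w).gather(1, idx).view(n, c)
        else:
            out = (probs * flat).sum(dim=-1)  # probability-weighted mean
        return out


class DualPoolHead(nn.Module):
    """Concatenate global average and global stochastic descriptors, then
    classify; either pooling alone would discard more information."""

    def __init__(self, channels: int, num_classes: int):
        super().__init__()
        self.stochastic = GlobalStochasticPool()
        self.fc = nn.Linear(2 * channels, num_classes)

    def forward(self, x):
        avg = x.mean(dim=(2, 3))   # global average pooling: background context
        sto = self.stochastic(x)   # global stochastic pooling: generalization
        return self.fc(torch.cat([avg, sto], dim=1))


# Example:
# head = DualPoolHead(128, 10)
# logits = head(torch.randn(4, 128, 8, 8))  # -> torch.Size([4, 10])
```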

CLC number: