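Vehicle Detection Method Based on NCS2 Neural Computing Stick

Cite this article

JIANG Xiaoyu, LI Zhongbing, ZHANG Junhao, et al. Vehicle Detection Method Based on NCS2 Neural Computing Stick[J]. Computer Engineering, 2021, 47(3): 298-303. DOI: 10.19678/j.issn.1000-3428.0056214. (in Chinese)

Funding

Ministry of Education Industry-University Cooperative Education Project (201801006095); Sichuan Province College Students' Innovation and Entrepreneurship Training Program (201810615094)

About the authors

JIANG Xiaoyu (born 1996), male, undergraduate student, main research interest: computer vision;
LI Zhongbing, lecturer, Ph.D.;
ZHANG Junhao, master's student;
PENG Jiao, master's student;
WEN Ting, undergraduate student

Article history

Received: 2019-10-08
Revised: 2020-02-13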

E-mail: ceroo1005@gmail.com

Vehicle Detection Method Based on NCS2 Neural Computing Stick

JIANG Xiaoyu, LI Zhongbing, ZHANG Junhao, PENG Jiao, WEN Ting

School of Electrical Engineering and Information, Southwest Petroleum University, Chengdu 610500, China

Abstract: Deep-learning-based vehicle detection methods are highly accurate and run in real time on high-performance computers and graphics processors, but their real-time performance drops on lower-performance embedded devices. Building on an improved Tiny-YOLO network, this paper proposes an embedded vehicle detection method that uses the NCS2 neural computing stick. Depthwise Separable Convolution (DSC) replaces the standard convolutions of the Tiny-YOLO network to reduce computation, and the pooling layers are removed in favor of fully convolutional layers to retain low-level feature information. The improved Tiny-YOLO network is trained with the TensorFlow deep learning framework and deployed to an embedded device equipped with the NCS2 neural computing stick. Experimental results show that, compared with the original Tiny-YOLO network, the improved network doubles the detection frame rate and raises the average detection accuracy by 1.12 and 0.23 percentage points on the MS COCO and VOC2007 datasets, respectively. With the NCS2 neural computing stick, the proposed method reaches 12 frames per second, a large improvement in real-time performance over the original Tiny-YOLO network.

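0 Introduction

Environmental perception is the basis of path planning for driverless vehicles. A driverless system captures images with cameras, detects surrounding vehicles in real time, and builds a cognitive model of its surroundings, thereby perceiving the environment. Traditional vehicle detection methods extract features mainly with the Histogram of Oriented Gradients (HOG)^[1] and the Scale-Invariant Feature Transform (SIFT)^[2], and classify them with adaptive classifiers such as the Support Vector Machine (SVM)^[3], Adaboost^[4], or Gradient Boosting^[5]; however, these methods cannot capture high-level semantic information. In recent years, deep learning has been widely applied to vehicle detection. Unlike traditional methods, deep-learning-based methods do not rely on hand-crafted features; they learn features adaptively through back-propagation^[6] and describe high-level semantic information well. They fall into one-stage and two-stage methods. Two-stage methods first locate candidate positions with region proposals^[7-8] and then classify each position; examples include R-CNN^[9], Fast R-CNN^[10], and Faster R-CNN^[7]. One-stage methods predict position and class in a single end-to-end pass; examples include YOLO^[11-12] and SSD^[13]. Deep-learning-based methods detect vehicles accurately, but their complex network models, large parameter counts, and long computation times result in poor real-time performance, which prevents their direct use in practical vehicle detection.

To address this problem, reference [14] proposed the Tiny-YOLO network, a simplified version of YOLO; however, when deployed on embedded devices it still has many parameters and a long computation time. Reference [15] therefore used a neural computing stick to accelerate computation. Inspired by these works, this paper replaces the standard convolutions of the Tiny-YOLO network with Depthwise Separable Convolution (DSC)^[16], deploys the improved Tiny-YOLO network on an embedded device equipped with an NCS2 neural computing stick, and compares and analyzes detection accuracy and real-time performance.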

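1 Principle of YOLOv3

In the YOLO family, YOLOv2 adds 1×1 convolutions to YOLOv1 and uses regularization to prevent overfitting. YOLOv3 improves on YOLOv2 and consists mainly of the Darknet53 backbone and fully convolutional layers^[17]. Darknet53 is built from residual structures^[18] containing 53 convolutional layers, which makes the network easier to train and improves computational efficiency.

A 416×416 pixel input image passes through Darknet53 and the fully convolutional layers to produce feature maps of 13×13, 26×26, and 52×52 pixels. Each feature map is divided into grid cells, and each grid cell outputs a tensor of size 1×1×(B×(5+C)), where 1×1 is the kernel size of the last convolutional layer and B is the number of bounding boxes (hereafter prediction boxes) that each grid cell can predict. Each prediction box has 5+C attributes: the offset $ {t}_{x} $ of the center x-coordinate, the offset $ {t}_{y} $ of the center y-coordinate, the width offset $ {t}_{w} $, the height offset $ {t}_{h} $, the Objectness score, and C class confidences.

Because gradients can become unstable when the YOLOv3 network is trained, the K-means clustering algorithm^[19] is applied to the MS COCO samples to generate 9 prior boxes of different scales, and each prediction box is a fine adjustment of one of these priors. Let $ {P}_{x} $ and $ {P}_{y} $ be the predicted coordinates of the prior-box center on the feature map, $ {P}_{w} $ and $ {P}_{h} $ the predicted width and height of the prior box, $ {G}_{x} $ and $ {G}_{y} $ the ground-truth coordinates of the center, and $ {G}_{w} $ and $ {G}_{h} $ the ground-truth width and height. The corresponding offsets are computed as follows: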

$ {t}_{x}={G}_{x}-{P}_{x} $

(1)

$ {t}_{y}={G}_{y}-{P}_{y} $

(2)

$ {t}_{w}=\ln\left({G}_{w}/{P}_{w}\right) $

(3)

$ {t}_{h}=\ln\left({G}_{h}/{P}_{h}\right) $

(4)

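The height and width offsets of the prior box are obtained by dividing the ground-truth value by the predicted value and mapping the quotient into log space. The offsets between the predicted and ground-truth values are used to correct the offset relationship between the prior box and the prediction box, as shown in Fig. 1.

Fig. 1 Offset relationship between the prior box and the prediction box

In Fig. 1, point A is the center of the prediction box and point B is the predicted center point; the grid cell containing it has coordinates $ \left({C}_{x}, {C}_{y}\right) $, which are determined by $ {P}_{x} $ and $ {P}_{y} $. The center coordinates $ {b}_{x} $ and $ {b}_{y} $ of the prediction box, together with its width $ {b}_{w} $ and height $ {b}_{h} $, are computed from $ {t}_{x} $, $ {t}_{y} $, $ {t}_{w} $, and $ {t}_{h} $ as follows: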

$ {b}_{x}=\sigma \left({t}_{x}\right)+{C}_{x} $

(5)

$ {b}_{y}=\sigma \left({t}_{y}\right)+{C}_{y} $

(6)

$ {b}_{w}={P}_{w}{\mathrm{e}}^{{t}_{w}} $

(7)

$ {b}_{h}={P}_{h}{\mathrm{e}}^{{t}_{h}} $

(8)

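where $ \sigma $ is the Sigmoid function. Each grid cell has size 1×1, so applying the Sigmoid function maps $ {t}_{x} $ and $ {t}_{y} $ into the range 0 to 1, which keeps the target center inside its grid cell and prevents excessive offset. Because $ {t}_{w} $ and $ {t}_{h} $ are defined in log space, exponentiating them recovers $ {G}_{w}/{P}_{w} $ or $ {G}_{h}/{P}_{h} $, and multiplying by $ {P}_{w} $ or $ {P}_{h} $ then yields the actual width and height.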
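The encoding in Eqs. (3)-(4) and the decoding in Eqs. (5)-(8) can be sketched in a few lines of Python. This is only an illustrative sketch: the function names and the numeric values are hypothetical and do not come from the paper, and all quantities are expressed in grid-cell units.

```python
import math


def sigmoid(t):
    """Logistic function used in Eqs. (5)-(6)."""
    return 1.0 / (1.0 + math.exp(-t))


def encode_size(g_w, g_h, p_w, p_h):
    """Eqs. (3)-(4): log-space size offsets of the ground truth relative to the prior."""
    return math.log(g_w / p_w), math.log(g_h / p_h)


def decode_box(t_x, t_y, t_w, t_h, c_x, c_y, p_w, p_h):
    """Eqs. (5)-(8): recover the box center and size from the offsets."""
    b_x = sigmoid(t_x) + c_x      # center stays inside grid cell (c_x, c_y)
    b_y = sigmoid(t_y) + c_y
    b_w = p_w * math.exp(t_w)     # undo the log-space scaling
    b_h = p_h * math.exp(t_h)
    return b_x, b_y, b_w, b_h


# Hypothetical example: a 3.0 x 2.5 prior and a 4.0 x 2.0 ground-truth box.
t_w, t_h = encode_size(g_w=4.0, g_h=2.0, p_w=3.0, p_h=2.5)
b_x, b_y, b_w, b_h = decode_box(0.0, 0.0, t_w, t_h, c_x=6, c_y=4, p_w=3.0, p_h=2.5)
```

With zero center offsets the Sigmoid returns 0.5, so the decoded center sits in the middle of cell (6, 4), and decoding the encoded size offsets reproduces the ground-truth width and height.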

The Objectness score is computed as follows:

$ {C}_{\mathrm{Object}}={P}_{\mathrm{Object}}\times \mathrm{IOU}_{\mathrm{pred}}^{\mathrm{truth}} $

(9)

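where $ {C}_{\mathrm{Object}} $ is the confidence that the grid cell contains a vehicle; $ {P}_{\mathrm{Object}} $ indicates whether a target is present, with $ {P}_{\mathrm{Object}}=1 $ when a target exists and $ {P}_{\mathrm{Object}}=0 $ otherwise; and IOU is the intersection over union of the areas of the prediction box and the labeled ground-truth box.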
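A minimal sketch of Eq. (9) follows. It assumes boxes are given as corner coordinates (x_min, y_min, x_max, y_max), a representation chosen here for illustration; the function names and example boxes are hypothetical.

```python
def iou(box_a, box_b):
    """Intersection over union of two boxes given as (x_min, y_min, x_max, y_max)."""
    inter_w = max(0.0, min(box_a[2], box_b[2]) - max(box_a[0], box_b[0]))
    inter_h = max(0.0, min(box_a[3], box_b[3]) - max(box_a[1], box_b[1]))
    inter = inter_w * inter_h
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def objectness(p_object, pred_box, truth_box):
    """Eq. (9): p_object is 1 if the grid cell contains a target, else 0."""
    return p_object * iou(pred_box, truth_box)


# Hypothetical boxes: two 2x2 squares overlapping in a 1x1 region.
pred = (0.0, 0.0, 2.0, 2.0)
truth = (1.0, 1.0, 3.0, 3.0)
score = objectness(1, pred, truth)  # intersection 1, union 7
```

When no target is present the score is zero regardless of the overlap, which matches the role of the indicator in Eq. (9).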

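In the original YOLOv3 network, B=3 and C=80, meaning that each grid cell predicts 3 bounding boxes over 80 classes. Using several prior boxes raises the probability that a prior matches the size of the target.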
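The output sizes described in this section can be checked with a short calculation. The sketch below is illustrative only: the function name is hypothetical, and the strides 32, 16, and 8 are inferred from the stated sizes (416/13, 416/26, 416/52) rather than quoted from the paper.

```python
def yolo_output_shapes(input_size=416, strides=(32, 16, 8), num_boxes=3, num_classes=80):
    """Return (grid_h, grid_w, depth) for each detection scale.

    depth = B * (5 + C): four box offsets, one Objectness score, and C class scores
    for each of the B prediction boxes in a grid cell.
    """
    depth = num_boxes * (5 + num_classes)
    return [(input_size // s, input_size // s, depth) for s in strides]


shapes = yolo_output_shapes()
```

For B=3 and C=80 each grid cell therefore carries 3×(5+80)=255 values at every scale.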

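2 Improved Tiny-YOLO

Tiny-YOLO removes the residual structures that deepen the original YOLO network, which saves memory and speeds up computation, and its output feature maps have only two grid sizes, 13×13 and 26×26. Its standard convolutions are still computationally expensive, whereas the depthwise separable convolution of MobileNet^[20] greatly reduces the amount of computation.

2.1 Depthwise separable convolution

Depthwise separable convolution factorizes a standard convolution into a depthwise convolution and a pointwise convolution, which reduces computational complexity and suits embedded devices. The factorization is shown in Fig. 2.

Fig. 2 Decomposition procedure of depthwise separable convolution

For a convolution, the input is $ F\in {\mathbb{R}}^{{D}_{\mathrm{f}}\times {D}_{\mathrm{f}}\times M} $, where $ {D}_{\mathrm{f}}\times {D}_{\mathrm{f}} $ is the size of the input feature map and M is its number of channels; the kernel is $ K\in {\mathbb{R}}^{{D}_{\mathrm{k}}\times {D}_{\mathrm{k}}\times M\times N} $, where $ {D}_{\mathrm{k}}\times {D}_{\mathrm{k}} $ is the kernel size, M is the number of channels per kernel, and N is the number of kernels; the output is $ G\in {\mathbb{R}}^{{D}_{\mathrm{g}}\times {D}_{\mathrm{g}}\times N} $, where $ {D}_{\mathrm{g}}\times {D}_{\mathrm{g}} $ is the size of the output feature map and N is its number of channels. The standard convolution is computed as: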

$ {G}_{k, l, n}=\sum\limits _{i, j, m}{K}_{i, j, m, n}{F}_{k+i-1, l+j-1, m} $

(10)

The computational cost of a standard convolution is:

$ {O}_{\mathrm{conv}}={D}_{\mathrm{k}}{D}_{\mathrm{k}}MN{D}_{\mathrm{f}}{D}_{\mathrm{f}} $

(11)

Depthwise convolution applies a single filter to each channel of the input feature map. Its input is $ F\in {\mathbb{R}}^{{D}_{\mathrm{f}}\times {D}_{\mathrm{f}}\times M} $ and its kernel is $ \widehat{K}\in {\mathbb{R}}^{{D}_{\mathrm{k}}\times {D}_{\mathrm{k}}\times 1\times M} $. It is computed as:

$ {\widehat{G}}_{k, l, m}=\sum\limits _{i, j}{\widehat{K}}_{i, j, m}{F}_{k+i-1, l+j-1, m} $

(12)

The computational cost of depthwise convolution is:

$ {\widehat{O}}_{\mathrm{conv}}={D}_{\mathrm{k}}{D}_{\mathrm{k}}M{D}_{\mathrm{f}}{D}_{\mathrm{f}} $

(13)

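Depthwise convolution is much cheaper than standard convolution, but it only filters the input channels and does not combine them to create new features. A 1×1 (pointwise) convolution is therefore applied to form linear combinations of the depthwise outputs and generate new features. The computational cost of depthwise separable convolution is:

$ {O}^{\prime}_{\mathrm{conv}}={D}_{\mathrm{k}}{D}_{\mathrm{k}}M{D}_{\mathrm{f}}{D}_{\mathrm{f}}+MN{D}_{\mathrm{f}}{D}_{\mathrm{f}} $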

(14)
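The two stages can be illustrated with a small pure-Python sketch: a per-channel filter following Eq. (12), then a 1×1 convolution that mixes channels. It is not the authors' implementation. The function names and toy tensors are hypothetical, indices are zero-based, and the sketch uses no padding with stride 1, so the output is smaller than the input, whereas the cost formulas above count a $ {D}_{\mathrm{f}}\times {D}_{\mathrm{f}} $ output.

```python
def depthwise_conv(F, K_hat):
    """Eq. (12): filter each channel with its own kernel (no padding, stride 1).

    F is indexed F[row][col][channel]; K_hat is indexed K_hat[i][j][channel].
    """
    d_f, M, d_k = len(F), len(F[0][0]), len(K_hat)
    d_g = d_f - d_k + 1
    return [[[sum(K_hat[i][j][m] * F[k + i][l + j][m]
                  for i in range(d_k) for j in range(d_k))
              for m in range(M)]
             for l in range(d_g)]
            for k in range(d_g)]


def pointwise_conv(G_hat, W):
    """1x1 convolution: W[m][n] combines M input channels into N output channels."""
    M, N = len(W), len(W[0])
    return [[[sum(px[m] * W[m][n] for m in range(M)) for n in range(N)]
             for px in row]
            for row in G_hat]


# Toy input: 4x4 map with 2 channels (all 1.0 and all 2.0), 3x3 kernels of ones.
F = [[[1.0, 2.0] for _ in range(4)] for _ in range(4)]
K_hat = [[[1.0, 1.0] for _ in range(3)] for _ in range(3)]
W = [[1.0, 0.0, 1.0],   # output channels 0 and 2 read input channel 0
     [0.0, 1.0, 1.0]]   # output channels 1 and 2 read input channel 1

G_hat = depthwise_conv(F, K_hat)   # 2x2 map, 2 channels
G = pointwise_conv(G_hat, W)       # 2x2 map, 3 channels
```

The depthwise stage never mixes channels; only the third output channel of the pointwise stage, which sums both inputs, contains information from more than one channel.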

The ratio of the cost of depthwise separable convolution to that of standard convolution is:

$ \frac{{O}^{\prime}_{\mathrm{conv}}}{{O}_{\mathrm{conv}}}=\frac{{D}_{\mathrm{k}}{D}_{\mathrm{k}}M{D}_{\mathrm{f}}{D}_{\mathrm{f}}+MN{D}_{\mathrm{f}}{D}_{\mathrm{f}}}{{D}_{\mathrm{k}}{D}_{\mathrm{k}}MN{D}_{\mathrm{f}}{D}_{\mathrm{f}}}=\frac{1}{N}+\frac{1}{{D}_{\mathrm{k}}^{2}} $

(15)

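Eq. (15) shows that, for N > 1 and a fixed $ {D}_{\mathrm{k}} $, depthwise separable convolution requires far less computation than standard convolution. For a 3×3 kernel the ratio approaches 1/9 as N grows.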
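Eqs. (11), (14), and (15) can be verified numerically. In the sketch below the function names are hypothetical and the layer dimensions (3×3 kernel, 256 input channels, 512 output channels, 13×13 feature map) are an illustrative assumption, not a layer taken from Table 1.

```python
def standard_conv_cost(d_k, m, n, d_f):
    """Eq. (11): multiplications in a standard convolution."""
    return d_k * d_k * m * n * d_f * d_f


def separable_conv_cost(d_k, m, n, d_f):
    """Eq. (14): depthwise cost plus 1x1 pointwise cost."""
    return d_k * d_k * m * d_f * d_f + m * n * d_f * d_f


def cost_ratio(d_k, n):
    """Eq. (15): the ratio depends only on the kernel size and output channels."""
    return 1.0 / n + 1.0 / (d_k * d_k)


# Hypothetical layer: 3x3 kernel, 256 -> 512 channels, 13x13 feature map.
std = standard_conv_cost(3, 256, 512, 13)
sep = separable_conv_cost(3, 256, 512, 13)
ratio = sep / std
```

For this layer the ratio is about 0.113, close to the 1/9 limit, because the 1/N term is small when N is 512.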

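2.2 Structure of the improved Tiny-YOLO network

To further reduce the computation of the Tiny-YOLO network, this paper uses 3×3 depthwise separable convolutions (S Conv) in place of the nine 3×3 standard convolutions (Conv) of the original Tiny-YOLO network. The network structures before and after the improvement and the details of the 3×3 standard convolutions are given in Fig. 3 and Table 1, respectively. To prevent pooling from discarding low-level feature information, all pooling layers (Maxpool) of the original Tiny-YOLO network are removed and fully convolutional layers are used for the connections instead.

Fig. 3 Structure of the Tiny-YOLO network before and after improvement

Table 1 3×3 standard convolution information in the original Tiny-YOLO network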

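3 Experiments and result analysis

The MS COCO dataset is used as the experimental dataset. A total of 821 images containing vehicles at different scales and random angles were selected, each with a height greater than 400 pixels. Of these, 700 MS COCO images form the training set; the remaining 121 MS COCO images and 895 images from the VOC2007 dataset form the test set.

The experiments use the Ubuntu 16.04 operating system and the TensorFlow deep learning framework. The target device is a Raspberry Pi 3B+ with an NCS2 neural computing stick, and the host machine is an E5 2680 + GTX1066.

3.1 Deployment of the NCS2 neural computing stick

On top of the improved Tiny-YOLO network, an NCS2 neural computing stick is deployed to further improve performance; the procedure is shown in Fig. 4. The input size of the training and test images for the improved Tiny-YOLO network is the same as for the original Tiny-YOLO network. Vehicle features are extracted by the depthwise separable convolutional network, and two feature maps of different sizes are then used for prediction. The improved Tiny-YOLO network is trained on the host machine with TensorFlow; the resulting TensorFlow model is converted by the OpenVINO Model Optimizer into an IR file supported by the NCS2 neural computing stick and deployed to the Raspberry Pi 3B+ equipped with the NCS2.

Fig. 4 Deployment procedure of NCS2 neural computing stick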

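3.2 Comparison of computation before and after the improvement

In neural network computation, multiplications far outnumber additions, and one multiplication takes much longer than one addition, so the total time spent on additions can be neglected. Counting one multiplication as one unit of computation, Fig. 5 compares the per-layer and total computation of the nine 3×3 standard convolution layers of the original Tiny-YOLO network before and after they are replaced by depthwise separable convolutions. After the replacement, both the per-layer and the total computation drop sharply: the total falls from 2.74×10⁹ to 0.39×10⁹, a reduction of about 86%.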
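As a quick check, the reported reduction follows directly from the two totals:

$ \frac{2.74\times {10}^{9}-0.39\times {10}^{9}}{2.74\times {10}^{9}}=\frac{2.35}{2.74}\approx 0.858 $

The remaining fraction, 0.39/2.74≈0.142, sits slightly above the 1/9≈0.111 floor that Eq. (15) gives for a 3×3 kernel, which is what one would expect once the 1/N term of each layer is added.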

Fig. 5 Comparison of the per-layer and total computation before and after convolution replacement

3.3 Comparison of accuracy and real-time performance before and after the improvement

Mean Average Precision (MAP) is used to evaluate the detection accuracy of the original Tiny-YOLO network, the improved Tiny-YOLO network, and the improved Tiny-YOLO network deployed with the NCS2 neural computing stick, and Frames Per Second (FPS) is used to evaluate real-time performance. MAP is computed as:

$ \mathrm{MAP}={\int }_{0}^{1}P\left(R\right)\mathrm{d}R $

(16)

where P is precision, R is recall, and P(R) is the precision at recall R, so the integral averages precision over all recall levels.
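Eq. (16) can be approximated from a finite set of precision-recall points. The sketch below applies the plain trapezoidal rule; this is an assumption made for illustration, since the paper does not state how the integral was evaluated, and benchmark toolkits often use an interpolated precision curve instead. The function name and the sample points are hypothetical.

```python
def average_precision(recalls, precisions):
    """Approximate Eq. (16) with the trapezoidal rule over recall-sorted points."""
    points = sorted(zip(recalls, precisions))
    area = 0.0
    for (r0, p0), (r1, p1) in zip(points, points[1:]):
        area += (r1 - r0) * (p0 + p1) / 2.0
    return area


# Hypothetical precision-recall points for a single class.
recalls = [0.0, 0.5, 1.0]
precisions = [1.0, 0.8, 0.4]
ap = average_precision(recalls, precisions)  # 0.45 + 0.30 = 0.75
```

With several object classes, the per-class values would be averaged to obtain a single score.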

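Table 2 lists the results obtained with the improved Tiny-YOLO network (hereafter improved Tiny-YOLO), the original Tiny-YOLO network (hereafter original Tiny-YOLO), and the improved Tiny-YOLO network deployed with the NCS2 neural computing stick (hereafter improved Tiny-YOLO+NCS2). The MAP of improved Tiny-YOLO is 0.0112 and 0.0023 higher than that of original Tiny-YOLO on the MS COCO and VOC2007 datasets, respectively, and its FPS is twice that of original Tiny-YOLO. The MAP of improved Tiny-YOLO+NCS2 is slightly lower than that of the other two methods, but its FPS reaches 12, far higher than theirs. Improved Tiny-YOLO+NCS2 thus trades a small loss in detection accuracy for much better real-time performance, which makes it better suited to deployment in driverless systems.

Table 2 Experimental results of three algorithms

3.4 Comparison of results in different scenes

The detection results of improved Tiny-YOLO and original Tiny-YOLO on the VOC2007 dataset are compared in Fig. 6, where the left and right images of each pair are produced by original Tiny-YOLO and improved Tiny-YOLO, respectively. When vehicles differ in size, original Tiny-YOLO loses small targets more readily than improved Tiny-YOLO. When vehicles are occluded, original Tiny-YOLO fails to detect the occluded vehicles, whereas improved Tiny-YOLO detects them accurately. In harsh conditions and at night, original Tiny-YOLO is more easily disturbed by the environment and lighting than improved Tiny-YOLO. These results indicate that improved Tiny-YOLO detects vehicles better than original Tiny-YOLO.

Fig. 6 Comparison of detection effect of two methods in different scenes

4 Conclusion

This paper proposes a vehicle detection method that combines an improved Tiny-YOLO network with the NCS2 neural computing stick. Depthwise separable convolution replaces the standard convolutions of the original Tiny-YOLO network, and the NCS2 neural computing stick provides deep learning acceleration for low-performance embedded devices. Experimental results show that the method reaches 12 frames per second, a large improvement in real-time performance over the original Tiny-YOLO network. Future work will apply quantization and compression to the improved Tiny-YOLO network to further increase speed, with the aim of running on common embedded devices such as the STM32.

References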

[1]	DALAL N, TRIGGS B. Histograms of oriented gradients for human detection[C]//Proceedings of 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition. Washington D.C., USA: IEEE Press, 2005: 886-893.
[2]	LOWE D G. Object recognition from local scale-invariant features[C]//Proceedings of the 7th IEEE International Conference on Computer Vision. Washington D.C., USA: IEEE Press, 1999: 1150-1157.
[3]	SÁNCHEZ A V. Advanced support vector machines and kernel methods[J]. Neurocomputing, 2003, 55(1): 5-20.
[4]	FERREIRA A J, FIGUEIREDO M A T. Boosting algorithms: a review of methods, theory, and applications[J]. Ensemble Machine Learning, 2012, 19(1): 35-85.
[5]	FRIEDMAN J H. Greedy function approximation: a gradient boosting machine[J]. The Annals of Statistics, 2000, 29(5): 31-39.
[6]	RUMELHART D E, HINTON G E, WILLIAMS R J. Learning representations by back-propagating errors[J]. Nature, 1986, 323(6088): 533-536. DOI:10.1038/323533a0
[7]	REN S Q, HE K M, GIRSHICK R, et al. Faster R-CNN: towards real-time object detection with region proposal networks[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2017, 39(6): 1137-1149. DOI:10.1109/TPAMI.2016.2577031
[8]	KONG Tao, YAO Anbang, CHEN Yurong, et al. HyperNet: towards accurate region proposal generation and joint object detection[C]//Proceedings of 2016 IEEE Conference on Computer Vision and Pattern Recognition. Washington D.C., USA: IEEE Press, 2016: 845-853.
[9]	GIRSHICK R, DONAHUE J, DARRELL T, et al. Rich feature hierarchies for accurate object detection and semantic segmentation[C]//Proceedings of 2014 IEEE Conference on Computer Vision and Pattern Recognition. Washington D.C., USA: IEEE Press, 2014: 21-28.
[10]	GIRSHICK R. Fast R-CNN[C]//Proceedings of 2015 IEEE International Conference on Computer Vision. Washington D.C., USA: IEEE Press, 2015: 32-37.
[11]	REDMON J, DIVVALA S, GIRSHICK R, et al. You only look once: unified, real-time object detection[C]//Proceedings of 2016 IEEE Conference on Computer Vision and Pattern Recognition. Washington D.C., USA: IEEE Press, 2016: 63-69.
[12]	REDMON J, FARHADI A. YOLO9000: better, faster, stronger[C]//Proceedings of 2017 IEEE Conference on Computer Vision and Pattern Recognition. Washington D.C., USA: IEEE Press, 2017: 51-56.
[13]	LIU W, ANGUELOV D, ERHAN D, et al. SSD: single shot multibox detector[C]//Proceedings of ECCV'16. Berlin, Germany: Springer, 2016: 21-37.
[14]	REDMON J, FARHADI A. YOLOv3: an incremental improvement[EB/OL]. [2019-08-23]. https://www.researchgate.net/publication/324387691_YOLOv3_An_Incremental_Improvement.
[15]	ZHANG Yangshuo, MIAO Zhuang, WANG Jiabao, et al. Pedestrian detection method based on Movidius neural computing stick[J]. Journal of Computer Applications, 2019, 39(8): 2230-2234. (in Chinese)
[16]	CHOLLET F. Xception: deep learning with depthwise separable convolutions[C]//Proceedings of 2017 IEEE Conference on Computer Vision and Pattern Recognition. Washington D.C., USA: IEEE Press, 2017: 1800-1807.
[17]	LONG J, SHELHAMER E, DARRELL T. Fully convolutional networks for semantic segmentation[C]//Proceedings of 2015 IEEE Conference on Computer Vision and Pattern Recognition. Washington D.C., USA: IEEE Press, 2015: 640-651.
[18]	HE Kaiming, ZHANG Xiangyu, REN Shaoqing. Deep residual learning for image recognition[C]//Proceedings of 2016 IEEE Conference on Computer Vision and Pattern Recognition.Washington D.C., USA: IEEE Press, 2016: 521-528.
[19]	WAGSTAFF K, CARDIE C, ROGERS S, et al. Constrained K-means clustering with background knowledge[C]//Proceedings of the 18th International Conference on Machine Learning. New York, USA: ACM Press, 2001: 577-584.
[20]	HOWARD A G, ZHU Menglong, CHEN Bo, et al. MobileNets: efficient convolutional neural networks for mobile vision applications[EB/OL].[2019-08-23].https://www.researchgate.net/publication/316184205_MobileNets_Efficient_Convolutional_Neural_Networks_for_Mobile_Vision_Applications.