
Computer Engineering ›› 2021, Vol. 47 ›› Issue (10): 236-241. doi: 10.19678/j.issn.1000-3428.0059168

• Graphics and Image Processing •

Research and Application of Lightweight Object Detection Algorithm

HUANG Jingsong1,2, ZUO Haorui1, ZHANG Jianlin1   

  1. Institute of Optics and Electronics, Chinese Academy of Sciences, Chengdu 610209, China;
    2. College of Computer Science, University of Chinese Academy of Sciences, Beijing 100049, China
  • Received: 2020-08-04 Revised: 2020-11-11 Published: 2020-11-11

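  • About the authors: HUANG Jingsong (黄靖淞, born 1995), male, master's student, whose main research interests are graphics and image processing, artificial intelligence, and embedded technology; ZUO Haorui (左颢睿), associate research fellow, Ph.D.; ZHANG Jianlin (张建林), research fellow, Ph.D., doctoral supervisor.
  • Supported by:
    Innovation Project of the Science and Technology Commission (G158207).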

Abstract: Existing object detection algorithms based on convolutional neural networks achieve high accuracy, but the accuracy gain comes at the cost of detection speed, which makes real-time detection difficult on platforms with limited computing power. To address this problem, a series of lightweight methods is applied to the YOLO object detection algorithm. The Darknet53 backbone network is replaced with MobileNetV1, and the 3×3 standard convolutions in the YOLO head are replaced with depthwise separable convolutions. On this basis, the filters of the convolution layers are ranked by sensitivity and pruned. Finally, the model is deployed with C++ inference on the embedded GPU TX2 platform. Test results on the VOC data set show that the improved algorithm achieves a 2.4× speedup while accuracy drops by only 0.75 percentage points. In addition, the improved model occupies only 21.5% of the memory occupied by the original model.
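To make the benefit of replacing 3×3 standard convolutions with depthwise separable convolutions concrete, the following minimal sketch counts weights and multiply-accumulate operations for both layer types. The layer dimensions (256 input channels, 512 output channels, a 13×13 output map) are hypothetical values chosen for illustration and are not taken from the paper; biases and batch normalization are ignored.

```python
def standard_conv_cost(c_in, c_out, k, h_out, w_out):
    """Weights and multiply-accumulates (MACs) of a k x k standard convolution, no bias."""
    params = k * k * c_in * c_out
    macs = params * h_out * w_out  # every weight is used once per output position
    return params, macs


def depthwise_separable_cost(c_in, c_out, k, h_out, w_out):
    """Weights and MACs of a k x k depthwise convolution followed by a 1 x 1 pointwise convolution."""
    depthwise = k * k * c_in   # one k x k filter per input channel
    pointwise = c_in * c_out   # 1 x 1 convolution that mixes channels
    params = depthwise + pointwise
    macs = params * h_out * w_out
    return params, macs


# Hypothetical layer shape, for illustration only.
c_in, c_out, k, h_out, w_out = 256, 512, 3, 13, 13

std_params, std_macs = standard_conv_cost(c_in, c_out, k, h_out, w_out)
dws_params, dws_macs = depthwise_separable_cost(c_in, c_out, k, h_out, w_out)
ratio = dws_params / std_params

print(f"standard:            {std_params} weights, {std_macs} MACs")
print(f"depthwise separable: {dws_params} weights, {dws_macs} MACs")
print(f"cost ratio:          {ratio:.4f}")
```

For equal input and output resolution, the ratio equals 1/c_out + 1/k², so with 3×3 kernels and many output channels the separable layer needs roughly one ninth of the weights and computation of the standard layer.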

Key words: object detection, lightweight, depthwise separable convolution, pruning, embedded GPU, C++ inference deployment
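The abstract does not specify how sensitivity is measured or how filters are ranked, so the sketch below is only one plausible reading, not the authors' procedure. It assumes that a layer's sensitivity is the accuracy drop observed when that layer alone is pruned at a given ratio, and that filters within a layer are ranked by L1 norm. All function names, layer names, and numbers are hypothetical and made up for illustration.

```python
def choose_prune_ratios(sensitivity, max_drop):
    """Pick, per layer, the largest pruning ratio whose measured accuracy drop is <= max_drop.

    sensitivity maps layer name -> {prune ratio: accuracy drop}. Layers with no
    acceptable ratio are left unpruned (ratio 0.0).
    """
    chosen = {}
    for layer, curve in sensitivity.items():
        acceptable = [r for r, drop in curve.items() if drop <= max_drop]
        chosen[layer] = max(acceptable, default=0.0)
    return chosen


def rank_filters_by_l1(filters):
    """Return filter indices ordered from smallest to largest L1 norm (assumed ranking criterion)."""
    norms = [sum(abs(w) for w in f) for f in filters]
    return sorted(range(len(filters)), key=lambda i: norms[i])


def filters_to_prune(filters, ratio):
    """Indices of the lowest-ranked filters to remove for the given pruning ratio."""
    n = int(len(filters) * ratio)
    return sorted(rank_filters_by_l1(filters)[:n])


# Made-up sensitivity measurements for two hypothetical layers.
sensitivity = {
    "conv_a": {0.1: 0.001, 0.3: 0.004, 0.5: 0.020},
    "conv_b": {0.1: 0.012, 0.3: 0.050, 0.5: 0.110},
}
ratios = choose_prune_ratios(sensitivity, max_drop=0.005)

# Made-up flattened weights of four filters in one layer.
filters = [[0.5, -0.5], [0.01, 0.02], [1.0, 1.0], [-0.1, 0.05]]
pruned = filters_to_prune(filters, 0.5)

print(ratios)
print(pruned)
```

Under these assumptions, insensitive layers receive a high pruning ratio while sensitive layers are left largely intact, which is how a large reduction in model size can coexist with a small accuracy loss.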

