Character Recognition and Judgment of Industrial Vision Box Based on Deep Learning

Computer Engineering, 2022, Vol. 48, Issue (1): 296-304. doi: 10.19678/j.issn.1000-3428.0059680

GE Yongjie^1,3, WANG Lidan^1,2,3,4, CHEN Dingxi^5, DUAN Shukai^1,2,3,4,6, GAN Xiuling^1,3

1. College of Electronic and Information Engineering, Southwest University, Chongqing 400715, China;
2. National and Local Joint Engineering Laboratory of Intelligent Transmission and Control Technology, Chongqing 400715, China;
3. Chongqing Key Laboratory of Brain-Inspired Computing and Intelligent Control, Chongqing 400715, China;
4. Chongqing Brain Science Collaborative Innovation Center, Chongqing 400715, China;
5. Midea Group, Foshan, Guangdong 528311, China;
6. School of Artificial Intelligence, Southwest University, Chongqing 400715, China

Received: 2020-10-10 Revised: 2020-12-24 Published: 2021-01-21
About the authors: GE Yongjie (born 1993), male, master's student, main research interests: machine learning and computer vision; WANG Lidan (corresponding author), professor, Ph.D., doctoral supervisor; CHEN Dingxi, master's degree; DUAN Shukai, professor, Ph.D., doctoral supervisor; GAN Xiuling, master's student.
Funding:
National Key Research and Development Program of China (2018YFB1306600); National Natural Science Foundation of China (62076207, 62076208, U20A20227, 61672436); Chongqing Key Project of Basic Science and Frontier Technology Research (cstc2017jcyjBX0050).

Abstract

Text printed on the outer packaging boxes of goods on factory production lines is sometimes incomplete, and failing to detect such defects in time affects the circulation and sale of the goods. This paper builds a data set of industrial commodity appearance information and proposes a deep learning-based method for box character recognition and matching judgment in industrial vision. The convolutional layers and batch normalization layers of YOLOv3 are merged, GIoU is introduced as the bounding box loss function, and a method for adaptively adjusting the localization coordinates is designed; together these improve the speed and accuracy of text detection and localization on the original image. The CRNN and Tesseract recognition engines are then trained and their recognition performance compared on cropped text images, and a character matching method is designed to judge whether the recognized characters are correct and to output the result, which reduces misjudgments. A system based on this method was tested on a production line. The experimental results show a recognition accuracy of up to 99.5%. Photographing the appearance of a single product, detecting and recognizing its characters, and outputting the result takes only about 3 s in total, which indicates that the proposed method can achieve real-time monitoring.
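The abstract does not spell out how the convolutional and batch normalization layers are merged. A minimal sketch of the standard inference-time folding is given below, under the assumption that this is the kind of merge meant. It treats a 1x1 convolution at a single pixel as a linear map so that it stays dependency-free; the function names `fold_batchnorm`, `linear`, and `batchnorm` are hypothetical and not taken from the paper.

```python
import math


def fold_batchnorm(weights, bias, gamma, beta, mean, var, eps=1e-5):
    """Fold inference-time batch normalization into the preceding layer.

    weights: one row of input weights per output channel
    bias, gamma, beta, mean, var: one value per output channel
    Returns (folded_weights, folded_bias).
    """
    folded_w, folded_b = [], []
    for w_row, b, g, bt, m, v in zip(weights, bias, gamma, beta, mean, var):
        # Per-channel scale that batch normalization applies at inference.
        scale = g / math.sqrt(v + eps)
        folded_w.append([w * scale for w in w_row])
        folded_b.append((b - m) * scale + bt)
    return folded_w, folded_b


def linear(weights, bias, x):
    """Apply y = W x + b (a 1x1 convolution evaluated at one pixel)."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, bias)]


def batchnorm(y, gamma, beta, mean, var, eps=1e-5):
    """Apply inference-time batch normalization per channel."""
    return [g * (yi - m) / math.sqrt(v + eps) + bt
            for yi, g, bt, m, v in zip(y, gamma, beta, mean, var)]
```

Applying `linear` with the folded parameters should give the same output as `linear` followed by `batchnorm`, so one layer is evaluated instead of two at inference time. Layers without a bias term can pass zeros for `bias`.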

Key words: deep learning, YOLOv3 algorithm, Convolutional Recurrent Neural Network (CRNN), character recognition, appearance information, real-time monitoring
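The abstract names GIoU as the bounding box loss but gives no formula. The sketch below follows the usual definition of generalized IoU for axis-aligned boxes, with the loss taken as 1 - GIoU. The `(x1, y1, x2, y2)` box format and the function names `giou` and `giou_loss` are assumptions made for illustration, and the paper's own implementation may differ.

```python
def giou(box_a, box_b):
    """Generalized IoU of two axis-aligned boxes given as (x1, y1, x2, y2).

    Assumes both boxes have positive area. The result lies in (-1, 1].
    """
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b

    area_a = (ax2 - ax1) * (ay2 - ay1)
    area_b = (bx2 - bx1) * (by2 - by1)

    # Intersection rectangle (zero if the boxes do not overlap).
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h

    union = area_a + area_b - inter
    iou = inter / union

    # Smallest axis-aligned box enclosing both inputs.
    enclose_w = max(ax2, bx2) - min(ax1, bx1)
    enclose_h = max(ay2, by2) - min(ay1, by1)
    enclose = enclose_w * enclose_h

    # Penalize the part of the enclosing box not covered by the union.
    return iou - (enclose - union) / enclose


def giou_loss(box_a, box_b):
    """Loss form commonly used for box regression: 1 - GIoU."""
    return 1.0 - giou(box_a, box_b)
```

Unlike plain IoU, this value keeps changing as two non-overlapping boxes move apart, which is why it can serve as a regression loss even when the predicted and ground-truth boxes do not intersect.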

CLC number: TP18

GE Yongjie, WANG Lidan, CHEN Dingxi, DUAN Shukai, GAN Xiuling. Character Recognition and Judgment of Industrial Vision Box Based on Deep Learning[J]. Computer Engineering, 2022, 48(1): 296-304.

https://www.ecice06.com/CN/Y2022/V48/I1/296

Figures/Tables: 17
