
Computer Engineering ›› 2022, Vol. 48 ›› Issue (3): 253-262. doi: 10.19678/j.issn.1000-3428.0060600

• Graphics and Image Processing •

Online Chinese and English Handwriting Recognition Method Based on Multiple Rules and Path Evaluation

FU Pengbin, LIU Penghui, YANG Huirong, DONG Aojing

  1. Faculty of Information Technology, Beijing University of Technology, Beijing 100124, China
  • Received: 2021-01-15 Revised: 2021-02-24 Published: 2022-03-11
  • About the authors: FU Pengbin (born 1967), male, associate professor, main research interests: image processing and pattern recognition; LIU Penghui, master's student; YANG Huirong (corresponding author), Ph.D.; DONG Aojing, master's student.
  • Funding:
    National Natural Science Foundation of China (61772048); Beijing Natural Science Foundation (4153058).

Abstract: Handwritten text recognition is mainly used in text input technology and plays a key role in the development of human-computer interaction. Because most online input methods cannot recognize mixed Chinese and English handwriting, an online recognition method for mixed Chinese and English handwritten text is proposed. Text strokes are first merged according to rules based on horizontal relative position, vertical overlap rate, and area overlap rate, and connected strokes are then split, which yields a series of character segments. These segments are classified as Chinese or English using geometric features (stroke count, aspect ratio, center deviation, and smoothness) together with recognition confidence. Based on the classification results, path evaluation with a natural-language model and a dynamic programming search are used to merge the candidate Chinese and English character segments separately, giving the Chinese and English character sequences to be recognized. These sequences are sent to the Chinese and English recognition models, respectively, both based on a Convolutional Neural Network (CNN), to obtain the handwritten text recognition result. Experimental results show that the recognition accuracy on online handwritten mixed Chinese and English text reaches 93.67%. The method segments online handwritten Chinese text lines, and it also performs well on online handwritten mixed Chinese and English text lines that contain connected strokes.
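The stroke-merging step named in the abstract can be illustrated with a minimal Python sketch. The abstract does not give the rule definitions or thresholds, so everything here is an assumption: the names `Box`, `x_overlap_rate`, `area_overlap_rate`, and `merge_strokes`, the threshold values, and the choice to measure the overlap rate on the horizontal extents of bounding boxes are all hypothetical. Connected-stroke splitting is not covered.

```python
from dataclasses import dataclass
from typing import List, Tuple

Point = Tuple[float, float]


@dataclass
class Box:
    """Axis-aligned bounding box of a stroke or stroke group."""
    x0: float
    y0: float
    x1: float
    y1: float

    @property
    def width(self) -> float:
        return self.x1 - self.x0

    @property
    def area(self) -> float:
        return (self.x1 - self.x0) * (self.y1 - self.y0)


def bounding_box(stroke: List[Point]) -> Box:
    xs = [p[0] for p in stroke]
    ys = [p[1] for p in stroke]
    return Box(min(xs), min(ys), max(xs), max(ys))


def union(a: Box, b: Box) -> Box:
    return Box(min(a.x0, b.x0), min(a.y0, b.y0), max(a.x1, b.x1), max(a.y1, b.y1))


def x_overlap_rate(a: Box, b: Box) -> float:
    """Overlap of the horizontal extents, relative to the narrower box (assumed definition)."""
    overlap = min(a.x1, b.x1) - max(a.x0, b.x0)
    narrower = min(a.width, b.width)
    if narrower <= 0:
        # Degenerate (perfectly vertical) stroke: count it as overlapping
        # when its x position falls inside the other box's extent.
        return 1.0 if overlap >= 0 else 0.0
    return max(0.0, overlap) / narrower


def area_overlap_rate(a: Box, b: Box) -> float:
    """Intersection area relative to the smaller box (assumed definition)."""
    w = min(a.x1, b.x1) - max(a.x0, b.x0)
    h = min(a.y1, b.y1) - max(a.y0, b.y0)
    smaller = min(a.area, b.area)
    if w <= 0 or h <= 0 or smaller <= 0:
        return 0.0
    return (w * h) / smaller


def merge_strokes(strokes: List[List[Point]],
                  x_thresh: float = 0.5,      # hypothetical threshold
                  area_thresh: float = 0.3    # hypothetical threshold
                  ) -> List[List[int]]:
    """Greedily merge each stroke into the previous group when the rules fire."""
    groups: List[Tuple[Box, List[int]]] = []
    for i, stroke in enumerate(strokes):
        box = bounding_box(stroke)
        if groups:
            gbox, members = groups[-1]
            starts_inside = box.x0 < gbox.x1  # horizontal relative position
            if starts_inside and (x_overlap_rate(gbox, box) >= x_thresh
                                  or area_overlap_rate(gbox, box) >= area_thresh):
                groups[-1] = (union(gbox, box), members + [i])
                continue
        groups.append((box, [i]))
    return [members for _, members in groups]


# A cross made of two strokes, followed by a separate stroke further right.
demo_strokes = [
    [(0.0, 5.0), (10.0, 5.0)],
    [(5.0, 0.0), (5.0, 10.0)],
    [(20.0, 0.0), (25.0, 10.0)],
]
demo_groups = merge_strokes(demo_strokes)
print(demo_groups)  # [[0, 1], [2]]
```

In this toy input the two crossing strokes end up in one segment and the third stroke starts a new one. A real system would tune the thresholds on data and would probably normalize them by line height.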

Key words: online handwriting recognition, mixed Chinese and English handwriting, Chinese and English classification, text line segmentation, path evaluation
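The path-evaluation step described in the abstract combines recognition confidence with a language model and searches for the best way to merge consecutive segments by dynamic programming. The sketch below shows that general idea only and is not the authors' scoring function. The names `best_path`, `recognize`, `bigram_logp`, and `max_merge` are hypothetical, as is the additive log-score, and the toy recognizer is made-up data for illustration.

```python
import math
from typing import Callable, Dict, List, Optional, Tuple

# (end index, number of fragments merged into the last candidate character)
State = Tuple[int, int]


def best_path(n_fragments: int,
              recognize: Callable[[int, int], Tuple[str, float]],
              bigram_logp: Callable[[Optional[str], str], float],
              max_merge: int = 3) -> List[Tuple[int, int, str]]:
    """Return the best segmentation as a list of (start, end, label) spans.

    recognize(start, end) gives (label, log-confidence) for the candidate
    built from fragments [start, end); bigram_logp(prev, cur) gives the
    language-model log-probability (prev is None at the line start).
    """
    if n_fragments == 0:
        return []
    best: Dict[State, float] = {}
    back: Dict[State, Optional[State]] = {}
    for end in range(1, n_fragments + 1):
        for k in range(1, min(max_merge, end) + 1):
            start = end - k
            label, log_conf = recognize(start, end)
            if start == 0:
                best[(end, k)] = log_conf + bigram_logp(None, label)
                back[(end, k)] = None
                continue
            options = []
            for pk in range(1, min(max_merge, start) + 1):
                prev_label, _ = recognize(start - pk, start)
                score = best[(start, pk)] + bigram_logp(prev_label, label) + log_conf
                options.append((score, (start, pk)))
            best[(end, k)], back[(end, k)] = max(options)
    # Choose the best final state, then walk the back-pointers.
    finals = [(best[(n_fragments, k)], (n_fragments, k))
              for k in range(1, min(max_merge, n_fragments) + 1)]
    state: Optional[State] = max(finals)[1]
    path: List[Tuple[int, int, str]] = []
    while state is not None:
        end, k = state
        path.append((end - k, end, recognize(end - k, end)[0]))
        state = back[state]
    return path[::-1]


# Toy data (made up): three fragments where the first two form one character.
TOY_SCORES = {
    (0, 1): ("女", 0.6), (1, 2): ("子", 0.6), (0, 2): ("好", 0.9),
    (2, 3): ("a", 0.9), (1, 3): ("?", 0.05), (0, 3): ("?", 0.01),
}


def toy_recognize(start: int, end: int) -> Tuple[str, float]:
    label, prob = TOY_SCORES[(start, end)]
    return label, math.log(prob)


def flat_bigram(prev: Optional[str], cur: str) -> float:
    return 0.0  # uninformative language model, for illustration only


demo_path = best_path(3, toy_recognize, flat_bigram)
print(demo_path)  # [(0, 2, '好'), (2, 3, 'a')]
```

Tracking the length of the last candidate in the state lets the bigram term see the previous label. Note that a plain sum of log-scores favors paths with fewer characters, so a practical scorer would likely add length normalization or learned weights.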
