
Computer Engineering (计算机工程) ›› 2006, Vol. 32 ›› Issue (23): 202-204. doi: 10.3969/j.issn.1000-3428.2006.23.072

• Artificial Intelligence and Recognition Technology •

Research on Structural Analysis of Mathematical Expressions in Printed Documents
(印刷体数学公式结构分析方法的研究)

TIAN Xuedong (田学东), LI Na (李娜), XU Lijuan (徐丽娟)

  1. (College of Mathematics and Computer Science, Hebei University, Baoding 071002)
  • Received: 1900-01-01 Revised: 1900-01-01 Online: 2006-12-05 Published: 2006-12-05

Abstract: Recognition of printed mathematical expressions is an important part of OCR technology and a bottleneck in the development of recognition technology. After reviewing the current state of formula recognition, this paper addresses structural analysis, a key stage of formula recognition, and proposes a two-dimensional structure analysis method based on baselines and the features of the blank regions between characters, into which semantic and contextual analysis strategies are integrated. Experiments show that the method is robust for formula structure analysis and has good application prospects.

Key words: Mathematical expression recognition, Structural analysis, Baseline, Blank region (operator range)
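
The abstract names baselines and inter-character blank regions as the cues for two-dimensional structure analysis but gives no algorithmic detail. The following Python fragment is only a minimal sketch of how such cues might be combined, and it is not the authors' method. The `Box` type, the `relation` function, the use of a symbol's bottom edge as a stand-in for its baseline, and the `shift_ratio` threshold are all assumptions introduced for illustration.

```python
from dataclasses import dataclass


@dataclass
class Box:
    """Hypothetical bounding box of a recognized symbol.

    Image coordinates are assumed: y grows downward.
    """
    label: str
    left: float
    top: float
    right: float
    bottom: float

    @property
    def height(self) -> float:
        return self.bottom - self.top


def relation(parent: Box, child: Box, shift_ratio: float = 0.3) -> str:
    """Classify `child` relative to the `parent` symbol on its left.

    Assumption: the bottom edge approximates the baseline, which
    ignores descenders such as 'y' or 'g'. `shift_ratio` is an
    illustrative threshold, not a value taken from the paper.
    """
    # Horizontal blank region between the two symbols.
    gap = child.left - parent.right
    if gap < 0:
        return "overlap"

    # Positive shift: the child's baseline sits above the parent's.
    shift = parent.bottom - child.bottom
    threshold = shift_ratio * parent.height

    if shift > threshold:
        return "superscript"
    if shift < -threshold:
        return "subscript"
    return "inline"
```

For instance, given a box for `x` and a smaller box to its right whose bottom edge lies well above that of `x`, `relation` would report a superscript. A realistic system would also need font-dependent baseline estimation and handling of fractions, radicals, and matrices, none of which is attempted here.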