
Computer Engineering, 2007, Vol. 33, Issue (15): 202-204. doi: 10.3969/j.issn.1000-3428.2007.15.071

• Artificial Intelligence and Recognition Technology •

Document Image Skew Correction Based on Page Layout Foreground and Least Square Method

CHEN Bo1, WANG Jia-jun1, WU Chen2

  (1. School of Electronics and Information, Soochow University, Suzhou 215021; 2. School of Electronics and Information, Jiangsu University of Science and Technology, Zhenjiang 212003)
  • Received: 1900-01-01 Revised: 1900-01-01 Online: 2007-08-05 Published: 2007-08-05

Abstract: To handle the complex layouts of document pages, this paper proposes a skew correction method based on the page foreground and the least squares method. Foreground pixels are described by specific patterns. A coarse classification of these patterns separates out any images (halftones), graphics, and tables on the page, and the remaining patterns are merged to obtain the largest text pattern structure. The baseline direction, which is taken as the skew direction of the page, is then fitted by least squares to the baseline feature points contained in that structure. Experiments show that the method is effective and fast, and the pattern structures it produces can be reused for subsequent layout analysis.
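
The abstract does not spell out the fitting step, so the following is only a minimal sketch of how a skew angle could be estimated from baseline feature points by ordinary least squares. It assumes the points are given as (x, y) pixel coordinates and that the baseline is modeled as y = a*x + b. The function name `skew_angle_degrees` and the synthetic data are hypothetical illustrations, not taken from the paper.

```python
import math


def skew_angle_degrees(points):
    """Fit y = a*x + b to the points by ordinary least squares
    and return the angle of the fitted line, atan(a), in degrees.

    `points` is a sequence of (x, y) pairs. Treating them as baseline
    feature points in pixel coordinates is an assumption of this sketch.
    """
    n = len(points)
    if n < 2:
        raise ValueError("need at least two points")
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    # Centered sums used in the closed-form least squares slope.
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    if sxx == 0:
        raise ValueError("all points share the same x; slope is undefined")
    slope = sxy / sxx
    return math.degrees(math.atan(slope))


# Synthetic baseline points lying on a line tilted by 3 degrees.
true_slope = math.tan(math.radians(3.0))
baseline_points = [(x, true_slope * x + 40.0) for x in range(0, 500, 25)]
print(round(skew_angle_degrees(baseline_points), 3))
```

In image coordinates the y axis usually points downward, so the sign of the returned angle depends on the coordinate convention in use. Rotating the page by the negative of the estimated angle would then undo the skew.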

Key words: skew correction, pattern structure, baseline feature points, page layout analysis

CLC number: