
Computer Engineering ›› 2006, Vol. 32 ›› Issue (2): 32-33, 47.

• Doctoral Dissertation •

Reading Order Based on Maximal Matching in Graph Theory for Non-Manhattan Layout

JIA Juan1, CHEN Kunqiu1, ZHOU Donghao2

  1. National Key Laboratory for Text Information Processing Technology, Institute of Computer Science and Technology, Peking University, Beijing 100871; 2. IBM China Co. Ltd., Beijing 100027
  • Online: 2006-01-20 Published: 2006-01-20

Abstract: In non-Manhattan layouts, irregular region shapes and complex spatial relationships make it difficult to determine an unambiguous reading order that follows the visual flow of the text. This is a key problem in both typesetting and document image understanding (DIU). To address it, a new layout model is defined in terms of layout objects and their spatial inclusion and ordinal relationships, and a reading-order detection algorithm based on maximal matching in graph theory is presented. The algorithm has been applied successfully in a professional Chinese and Japanese typesetting system with satisfactory results, and it is of theoretical and practical value for further research on DIU.

Key words: Maximal matching; Non-Manhattan layout; Reading order; Spatial relation
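
The abstract does not spell out how the matching is used, so the following is only a minimal sketch of one plausible reading, not the authors' algorithm. It assumes that each layout block may be followed directly by a set of candidate successor blocks (the hypothetical `succ` mapping, which a real system would derive from the spatial inclusion and ordinal relationships), and that these candidates form an acyclic graph. A maximum-cardinality bipartite matching between "block as predecessor" and "block as successor" then links the blocks into as few reading chains as possible. The function names `maximum_matching` and `reading_chains` are illustrative.

```python
from typing import Dict, List, Set


def maximum_matching(succ: Dict[str, List[str]]) -> Dict[str, str]:
    """Maximum bipartite matching via augmenting paths (Kuhn's algorithm).

    succ maps each block to the blocks that may directly follow it
    (an assumed input). Returns {block: chosen successor}.
    """
    matched_pred: Dict[str, str] = {}  # successor -> predecessor matched to it

    def try_assign(u: str, seen: Set[str]) -> bool:
        # Try to give u a successor, re-routing earlier choices if needed.
        for v in succ.get(u, []):
            if v in seen:
                continue
            seen.add(v)
            if v not in matched_pred or try_assign(matched_pred[v], seen):
                matched_pred[v] = u
                return True
        return False

    for u in succ:
        try_assign(u, set())
    return {u: v for v, u in matched_pred.items()}


def reading_chains(blocks: List[str], succ: Dict[str, List[str]]) -> List[List[str]]:
    """Follow matched successor links from every block that has no predecessor.

    Assumes the candidate-successor graph is acyclic, so every block
    ends up in exactly one chain.
    """
    nxt = maximum_matching(succ)
    has_pred = set(nxt.values())
    chains: List[List[str]] = []
    for b in blocks:
        if b not in has_pred:
            chain = [b]
            while chain[-1] in nxt:
                chain.append(nxt[chain[-1]])
            chains.append(chain)
    return chains


# Hypothetical four-block page: a title above two columns and a caption.
blocks = ["title", "col1", "col2", "caption"]
succ = {"title": ["col1", "col2"], "col1": ["col2"], "col2": ["caption"]}
order = reading_chains(blocks, succ)  # [['title', 'col1', 'col2', 'caption']]
```

When the matching leaves more than one chain, the chains themselves still need to be ordered, for example by the position of each chain's first block. Note also that the sketch computes a maximum-cardinality matching, which is the usual sense of 最大匹配 and is stronger than a merely maximal matching in the strict graph-theoretic sense.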