Computer Engineering ›› 2008, Vol. 34 ›› Issue (15): 231-233. doi: 10.3969/j.issn.1000-3428.2008.15.083

• Engineering Application Technology and Implementation •

Color Document Image Layout Analysis

HUANG Hai-ling, LIU Lie-gen, ZHANG Yu   

  1. (Research Institution of Computer Application Engineering, South China University of Technology, Guangzhou 510641)
  • Online: 2008-08-05 Published: 2008-08-05

Abstract: Document image processing is an effective means of detecting and filtering spam that is sent over the network as images rather than as text. This paper analyzes the layout of color document images in order to segment specific objects in the image, so that downstream steps can check whether the image contains particular character information. A network spam-filtering system can then use this information to decide whether to filter the message. Experimental results show that the method detects such content effectively in color document images of different color depths and different geometric structures, and that it has practical value for document image information extraction and filtering.

Key words: document image, layout analysis, connected components, normalization
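
The abstract does not spell out the segmentation procedure, so the following is only a minimal sketch of the generic connected-component step named in the key words, not the authors' method. It assumes the color image has already been reduced to a binary grid (1 = foreground, 0 = background); the function name `connected_components`, the 4-connectivity rule, and the sample `page` are illustrative assumptions.

```python
from collections import deque


def connected_components(grid):
    """Return bounding boxes of 4-connected foreground regions.

    `grid` is a list of rows holding 0 (background) or 1 (foreground).
    Each box is (top, left, bottom, right) with inclusive coordinates,
    listed in the order the regions are first met in a row-major scan.
    This is an illustrative helper, not code from the paper.
    """
    rows = len(grid)
    cols = len(grid[0]) if rows else 0
    seen = [[False] * cols for _ in range(rows)]
    boxes = []
    for r in range(rows):
        for c in range(cols):
            if grid[r][c] != 1 or seen[r][c]:
                continue
            # Breadth-first flood fill from an unvisited foreground pixel.
            seen[r][c] = True
            queue = deque([(r, c)])
            top, left, bottom, right = r, c, r, c
            while queue:
                y, x = queue.popleft()
                top, bottom = min(top, y), max(bottom, y)
                left, right = min(left, x), max(right, x)
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    inside = 0 <= ny < rows and 0 <= nx < cols
                    if inside and grid[ny][nx] == 1 and not seen[ny][nx]:
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            boxes.append((top, left, bottom, right))
    return boxes


# Hypothetical 4x5 binarized page with two separate foreground regions.
page = [
    [1, 1, 0, 0, 0],
    [1, 1, 0, 1, 0],
    [0, 0, 0, 1, 0],
    [0, 0, 0, 1, 1],
]
print(connected_components(page))  # [(0, 0, 1, 1), (1, 3, 3, 4)]
```

In a layout-analysis pipeline, boxes like these would typically be grouped into candidate text or graphic regions and scaled to a common size (the "normalization" key word) before any character check; how the paper performs those steps is not stated in the abstract.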
