Computer Engineering ›› 2024, Vol. 50 ›› Issue (3): 16-27. doi: 10.19678/j.issn.1000-3428.0067427

• Hot Topics and Reviews •

Review of Natural Scene Text Detection Based on Deep Learning

Zhe LIAN*, Yanjun YIN, Fei YUN, Min ZHI

  1. School of Computer Science and Technology, Inner Mongolia Normal University, Hohhot 010022, Inner Mongolia, China
  • Received: 2023-04-20 Online: 2024-03-15 Published: 2024-03-22
  • Contact: Zhe LIAN
  • Supported by:
    Natural Science Foundation of Inner Mongolia Autonomous Region (2021LHMS06009); Scientific Research Project of Higher Education Institutions of Inner Mongolia Autonomous Region (NJZZ21004)

Abstract:

Natural scene text detection based on deep learning has become an important research direction in computer vision and natural language processing. It has broad application prospects and also gives researchers a new platform for exploring neural network models and algorithms. First, this review introduces the relevant concepts, research background, and current state of natural scene text detection. Next, it analyzes recent deep learning-based text detection methods and divides them into four categories: detection box-based, segmentation-based, hybrid (combining detection boxes and segmentation), and others. For the classical and mainstream methods in each category, it describes the basic ideas and main algorithmic pipelines; summarizes their mechanisms, applicable scenarios, strengths and weaknesses, experimental results, and environment settings; and clarifies how the methods relate to one another. It then introduces the commonly used public datasets and the performance evaluation methods for natural scene text detection. Finally, it identifies the main challenges currently facing deep learning-based natural scene text detection and discusses directions for future work.

Key words: deep learning, computer vision, natural scene text, text detection, multi-directional text detection, multi-scale text detection