
Computer Engineering (计算机工程) ›› 2021, Vol. 47 ›› Issue (11): 11-21, 28. doi: 10.19678/j.issn.1000-3428.0061174

• Hot Topics and Reviews •

Survey on Abstractive Text Summarization Technologies Based on Deep Learning

ZHU Yongqing1, ZHAO Peng1, ZHAO Feifei2, MU Xiaodong1, BAI Kun1, YOU Xuanang1

  1. College of Operational Support, Rocket Force University of Engineering, Xi'an 710025, China;
  2. Army Academy of Border and Coastal Defence, Xi'an 710025, China
  • Received: 2021-03-17 Revised: 2021-05-09 Published: 2021-11-09
  • About the authors: ZHU Yongqing (born 1992), male, master's student, whose main research interests are intelligent information processing and natural language processing; ZHAO Peng, associate professor, Ph.D.; ZHAO Feifei, lecturer; MU Xiaodong, professor, Ph.D., doctoral supervisor; BAI Kun and YOU Xuanang, master's students.
  • Supported by:
    a national ministry and commission fund.

Abstract: Driven by the rapid growth of Internet data and the fast development of deep learning, automatic text summarization has become one of the main research directions in natural language processing, and its technologies and applications have been widely studied. To support further research on summarization, and guided by the key problems encountered in existing work, this paper surveys current abstractive text summarization models based on deep learning. It briefly describes their definition and origin, data preprocessing and basic framework, and common datasets and evaluation criteria. It then points out the advantages and key problems of these models and presents feasible solutions to those problems. Next, it compares commonly used deep pre-trained models with models that incorporate innovative methods, analyzes the innovations and limitations of each, and proposes ideas for addressing some of the limitations. Finally, it discusses future development directions in this field.

Key words: deep learning, abstractive text summarization, Out of Vocabulary (OOV), generative repetition, long-term dependence, evaluation criteria

CLC Number: