Survey on Abstractive Text Summarization Technologies Based on Deep Learning

doi:10.19678/j.issn.1000-3428.0061174

Abstract

Driven by the rapid expansion of Internet data and the fast development of deep learning, automatic text summarization has become one of the main research directions in natural language processing, and its technologies and applications have been widely studied. To support further research on summarization and to address the key problems encountered in earlier studies, this paper introduces existing abstractive text summarization models based on deep learning. It briefly describes their definition and origin, data preprocessing and basic framework, common datasets, and evaluation criteria, then identifies the advantages and key problems of these models and presents feasible solutions to those problems. The paper also compares commonly used deep pre-trained models with models that integrate innovative methods, analyzes the innovations and limitations of each model, and proposes approaches to some of the limitations. Finally, it discusses future development directions in this field.

Key words: deep learning, abstractive text summarization, out-of-vocabulary (OOV) words, repetitive generation, long-range dependency, evaluation criteria

CLC number: TP391

ZHU Yongqing, ZHAO Peng, ZHAO Feifei, MU Xiaodong, BAI Kun, YOU Xuanang. Survey on Abstractive Text Summarization Technologies Based on Deep Learning[J]. Computer Engineering, 2021, 47(11): 11-21, 28.

http://www.ecice06.com/CN/Y2021/V47/I11/11


