Summarization Model Using Multi-Task Learning Fused with Text Classification

doi:10.19678/j.issn.1000-3428.0057448

Abstract

Abstract: A text summary should include all the important information in the source text, but summaries generated by traditional summarization models based on the encoder-decoder architecture have low accuracy. Based on the correlation between text classification and text summarization, this paper proposes a summarization model using Multi-Task Learning (MTL). The model learns abstract information from an auxiliary text classification task to improve the quality of the generated summaries. The K-means clustering algorithm is used to construct three text classification datasets, Cluster-2, Cluster-10 and Cluster-20, for training the classifier, and the effect on summarization performance of training with each classification dataset is studied. In addition, a discriminant method based on statistical distribution is used to evaluate summary accuracy comprehensively. Experimental results on the CNNDM test set show that, compared with a strong baseline model, the proposed model improves ROUGE-1, ROUGE-2 and ROUGE-L by 0.23, 0.17 and 0.31 percentage points respectively, indicating that its generated summaries are more accurate.

Key words: encoder-decoder architecture, text summarization, text classification, Multi-Task Learning (MTL), clustering algorithm, statistical distribution
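
The abstract does not say how documents are vectorised or how K-means is initialised, so the following is only a minimal sketch of how cluster ids could serve as pseudo class labels for the auxiliary classification task. The bag-of-words features, the farthest-point initialisation (chosen here only to keep the toy run deterministic), the helper names (`bow_vectors`, `kmeans_labels`, `build_cluster_dataset`) and the four toy sentences are all assumptions made for illustration, not the authors' implementation.

```python
from collections import Counter


def bow_vectors(docs):
    """Turn whitespace-tokenised documents into dense term-count vectors."""
    vocab = sorted({tok for doc in docs for tok in doc.lower().split()})
    index = {tok: i for i, tok in enumerate(vocab)}
    vectors = []
    for doc in docs:
        vec = [0.0] * len(vocab)
        for tok, count in Counter(doc.lower().split()).items():
            vec[index[tok]] = float(count)
        vectors.append(vec)
    return vectors


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def init_centroids(vectors, k):
    """Farthest-point initialisation (an assumption, used for determinism)."""
    centroids = [list(vectors[0])]
    while len(centroids) < k:
        far = max(vectors, key=lambda v: min(sq_dist(v, c) for c in centroids))
        centroids.append(list(far))
    return centroids


def kmeans_labels(vectors, k, iters=50):
    """Lloyd's algorithm: alternate assignment and centroid update."""
    centroids = init_centroids(vectors, k)
    labels = None
    for _ in range(iters):
        new_labels = [
            min(range(k), key=lambda c: sq_dist(v, centroids[c])) for v in vectors
        ]
        if new_labels == labels:  # assignments stable, so stop early
            break
        labels = new_labels
        for c in range(k):
            members = [v for v, lab in zip(vectors, labels) if lab == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


def build_cluster_dataset(docs, k):
    """Pair every document with its cluster id as a pseudo class label."""
    return list(zip(docs, kmeans_labels(bow_vectors(docs), k)))


# Hypothetical toy corpus; the paper clusters CNNDM articles instead.
docs = [
    "stocks fell as markets closed lower",
    "markets rallied and stocks rose",
    "the team won the football match",
    "the football team lost the match",
]

# The toy corpus only supports small k; the paper reports k = 2, 10 and 20.
datasets = {f"Cluster-{k}": build_cluster_dataset(docs, k) for k in (2, 3)}
for name, pairs in datasets.items():
    print(name, [label for _, label in pairs])
```

On this toy input with k = 2, the two finance sentences and the two sports sentences fall into separate clusters. The (document, label) pairs could then be used to train the classifier branch alongside the summarizer; the Cluster-2, Cluster-10 and Cluster-20 datasets named in the abstract presumably correspond to k = 2, 10 and 20.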

CLC number: TP391

ZHOU Weixiao, LAN Wenfei. Summarization Model Using Multi-Task Learning Fused with Text Classification[J]. Computer Engineering, 2021, 47(4): 48-55.

https://www.ecice06.com/CN/Y2021/V47/I4/48

Figures/Tables: 9


