Computer Engineering (计算机工程) ›› 2020, Vol. 46 ›› Issue (6): 60-64. doi: 10.19678/j.issn.1000-3428.0054540

• Artificial Intelligence and Pattern Recognition •

  • About the authors: FENG Dujuan (born 1994), female, M.S. candidate; her main research interests are natural language processing and automatic summarization. YANG Lu and YAN Jianfeng, associate professors.
  • Funding:
    National Natural Science Foundation of China (61572339, 61272449); Key Project of the Jiangsu Province Science and Technology Support Program (BE2014005).

Research on Automatic Text Summarization Based on Dual-Encoder Structure

FENG Dujuan, YANG Lu, YAN Jianfeng   

  1. School of Computer Science and Technology, Soochow University, Suzhou, Jiangsu 215006, China
  • Received:2019-04-09 Revised:2019-06-11 Published:2019-06-22



Abstract: This paper constructs a CGAtten-GRU model based on a dual-encoder network structure to address the problem that the encoder in a sequence-to-sequence (seq2seq) model cannot fully encode the source text. The two encoders use a Convolutional Neural Network (CNN) and a Bidirectional Gated Recurrent Unit (BiGRU) respectively, and the source text enters both encoders in parallel. An attention mechanism is constructed from the outputs of the two encoding networks. The decoder uses a GRU network combined with the Copy mechanism and beam search to improve decoding accuracy. Experimental results on the large-scale Chinese short-text summarization dataset LCSTS show that, compared with the RNN context model, the proposed model improves Rouge-1 by 0.1, Rouge-2 by 0.059, and Rouge-L by 0.046.
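The core idea of the abstract, fusing the outputs of two parallel encoders into a single attention context for the decoder, can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the fusion by concatenation, the dot-product scoring, and all names and dimensions below are assumptions for illustration, since the abstract does not specify these details.

```python
import numpy as np

def softmax(x):
    # numerically stable softmax over a 1-D score vector
    e = np.exp(x - x.max())
    return e / e.sum()

def dual_encoder_attention(h_cnn, h_gru, s, W):
    """Fuse two encoders' outputs and attend with the decoder state.

    h_cnn, h_gru: (T, d) per-position outputs of the CNN and BiGRU encoders.
    s: (d_dec,) current decoder (GRU) hidden state.
    W: (2*d, d_dec) projection mapping fused encoder states into the
       decoder space for dot-product scoring (an assumed scoring form).
    Returns the (2*d,) context vector and the (T,) attention weights.
    """
    h = np.concatenate([h_cnn, h_gru], axis=1)  # (T, 2d) fused states
    scores = (h @ W) @ s                        # (T,) alignment scores
    alpha = softmax(scores)                     # attention distribution
    context = alpha @ h                         # weighted sum -> (2d,)
    return context, alpha

# toy example with random encoder outputs
rng = np.random.default_rng(0)
T, d, d_dec = 5, 4, 3
ctx, alpha = dual_encoder_attention(
    rng.standard_normal((T, d)), rng.standard_normal((T, d)),
    rng.standard_normal(d_dec), rng.standard_normal((2 * d, d_dec)))
```

In the full model, `ctx` would feed the GRU decoder at each step, with the Copy mechanism and beam search applied on top of the decoder's output distribution.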

Key words: Natural Language Processing(NLP), abstractive summarization, Convolutional Neural Network(CNN), Gated Recurrent Unit(GRU), attention mechanism, sequence-to-sequence(seq2seq) model, Copy mechanism

CLC Number: