Multi-Document Neural Reading Comprehension Based on Bi-Directional Attention Mechanism

doi:10.19678/j.issn.1000-3428.0056056

Abstract

Abstract: Machine Reading Comprehension (MRC) is a question answering task in which a system automatically generates or extracts the answer to a specific question from a given text. It is one of the important tasks for evaluating how well computer systems understand natural language. Compared with traditional reading comprehension tasks, multi-document reading comprehension demands stronger reasoning and comprehension capabilities from the model. To this end, this paper proposes a reading comprehension model based on multi-task joint training. The model is a joint learning model composed of a set of neural networks with different functions. Imitating the basic way people reason and answer questions, it performs two key steps: document selection and answer extraction. Document selection incorporates a relevance discrimination mechanism based on the attention matrix, which aims to establish connections between documents, while answer extraction uses a discourse-level bi-directional attention mechanism to find textual clues related to the answer. Attaching both components to a single neural reading comprehension model yields a multi-document reading comprehension method based on joint learning. Experimental results on the HotpotQA dataset show that, compared with the baseline model, the proposed model improves the EM and F1 values by 2.1% and 1.7%, respectively.

Key words: Machine Reading Comprehension (MRC), multi-document, reasoning, joint training, attention mechanism
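
The bi-directional attention mentioned in the abstract can be illustrated with a minimal sketch. It follows the general BiDAF-style pattern (context-to-question and question-to-context attention over a similarity matrix) and is not the authors' implementation: the plain dot-product similarity, the function names, and the toy vectors are all assumptions made for illustration, and a trained model would normally use a learned similarity function.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def weighted_sum(weights, vectors):
    """Sum of vectors scaled by their weights."""
    dim = len(vectors[0])
    return [sum(w * vec[k] for w, vec in zip(weights, vectors))
            for k in range(dim)]


def bidirectional_attention(context, question):
    """Return (similarity, context_to_question, question_to_context).

    context:  list of T token vectors
    question: list of J token vectors of the same dimension
    Assumption: similarity is a plain dot product, not a learned function.
    """
    # similarity[t][j] scores context token t against question token j.
    similarity = [[dot(c, q) for q in question] for c in context]
    # Context-to-question: each context token attends over question tokens.
    c2q = [weighted_sum(softmax(row), question) for row in similarity]
    # Question-to-context: weight each context token by its best match
    # against any question token, then pool the context into one vector.
    q2c = weighted_sum(softmax([max(row) for row in similarity]), context)
    return similarity, c2q, q2c


# Toy inputs (hypothetical): 3 context tokens, 2 question tokens, dimension 2.
context = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
question = [[1.0, 0.0], [0.0, 2.0]]
similarity, c2q, q2c = bidirectional_attention(context, question)
```

In this sketch, `c2q` holds one question-aware vector per context token, while `q2c` is a single vector summarizing the context tokens most relevant to the question; a full model would combine both with the original context encoding before predicting answer spans.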

CLC number: TP18

TANG Hongxuan, WU Kaili, ZHU Mengmeng, HONG Yu. Multi-Document Neural Reading Comprehension Based on Bi-Directional Attention Mechanism[J]. Computer Engineering, 2020, 46(12): 43-51.

http://www.ecice06.com/CN/Y2020/V46/I12/43
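
The abstract reports results as EM (exact match) and F1 values. As a rough illustration of how such answer-level metrics are commonly computed for extractive question answering, the sketch below follows the widely used SQuAD-style normalization and token-overlap F1. It is an assumption that the paper's evaluation matches this exactly; the official HotpotQA script may treat some special cases differently, and the function names here are illustrative.

```python
import re
import string
from collections import Counter


def normalize_answer(text):
    """Lowercase, drop punctuation and English articles, collapse whitespace."""
    text = text.lower()
    punctuation = set(string.punctuation)
    text = "".join(ch for ch in text if ch not in punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())


def exact_match(prediction, gold):
    """1.0 if the normalized strings are identical, else 0.0."""
    return float(normalize_answer(prediction) == normalize_answer(gold))


def f1_score(prediction, gold):
    """Token-level F1 between the normalized prediction and gold answer."""
    pred_tokens = normalize_answer(prediction).split()
    gold_tokens = normalize_answer(gold).split()
    if not pred_tokens or not gold_tokens:
        # Both empty counts as a match; only one empty counts as a miss.
        return float(pred_tokens == gold_tokens)
    overlap = sum((Counter(pred_tokens) & Counter(gold_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(gold_tokens)
    return 2 * precision * recall / (precision + recall)


# Hypothetical example answers, not taken from the paper.
em = exact_match("The Eiffel Tower.", "eiffel tower")
f1 = f1_score("Eiffel Tower in Paris", "the Eiffel Tower")
```

Dataset-level scores are then the averages of these per-question values, which is how figures such as the EM and F1 improvements in the abstract are usually obtained.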
