
Computer Engineering ›› 2022, Vol. 48 ›› Issue (7): 89-96. doi: 10.19678/j.issn.1000-3428.0061496

• Artificial Intelligence and Pattern Recognition •

Judgement of Causality Options in Reading Comprehension Incorporating Representation of Causality

PU Ruili1, WANG Yuanlong1, LI Ru1,2

  1. School of Computer and Information Technology, Shanxi University, Taiyuan 030006, China;
  2. Key Laboratory of Computation Intelligence and Chinese Information Processing of Ministry of Education, Shanxi University, Taiyuan 030006, China
  • Received: 2021-04-28 Revised: 2021-07-19 Online: 2022-07-15 Published: 2022-07-12
  • About the authors: PU Ruili (born 1997), female, master's student, main research interest: natural language processing; WANG Yuanlong (corresponding author), associate professor, Ph.D.; LI Ru, professor, Ph.D.
  • Funding:
    National Key Research and Development Program of China, key special project "Key Technologies and Systems of Human-like Intelligence Based on Big Data" (2018YFB1005103); National Natural Science Foundation of China, "Research on Frame Reasoning Technology for Chinese Discourse Semantic Analysis" (61772324); National Natural Science Foundation of China, "Research on Key Technologies of Event-Based Reading Comprehension of Image and Text Data" (61806117).

Abstract: In reading comprehension tasks, causality options are options that contain causal cue words; judging them requires the representation of causality in the source passage. This paper constructs a causality network for the college entrance examination reading comprehension task and proposes a method for judging causality options that incorporates a representation of causality. First, pattern matching is used to extract causal sentence pairs from the passage; causal word pairs are then extracted from these sentence pairs, and the causal association strength of each word pair is computed with pointwise mutual information (PMI), yielding a causality network that represents the causality in the passage. On this basis, the causality representation is incorporated into the BERT model to predict whether a causality option is consistent with the passage. In addition, drawing on the college entrance examination reading comprehension syllabus and the corpus, option errors are divided into four types: reversed cause and effect, imposed causality, substituted cause or effect, and other. The error type of an option is determined from the characteristics of each type together with the prediction result, and an error explanation is provided to improve the interpretability of the method. Experiments on 4,071 causality options from science and technology reading comprehension passages in national college entrance examination papers and mock papers from the past 15 years yield an F1 score of 62.09%, verifying the effectiveness of the proposed method.

Key words: college entrance examination reading comprehension, causality options, causality network, representation of causality, interpretability
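
The network construction step summarized in the abstract (pattern matching for causal pairs, then pointwise mutual information over causal word pairs) can be illustrated with a minimal sketch. Everything concrete here is an assumption for illustration only: the English cue patterns, the toy stopword list, the whitespace-level tokenizer, the sample sentences, and the names `CAUSAL_PATTERNS`, `extract_causal_pair`, and `build_causal_network`. The paper works on Chinese passages and its actual patterns and preprocessing are not given in this abstract.

```python
import math
import re
from collections import Counter
from itertools import product

# Hypothetical cue patterns (assumption): each captures a cause clause and an effect clause.
CAUSAL_PATTERNS = [
    re.compile(r"^because (?P<cause>.+?), (?P<effect>.+)$"),
    re.compile(r"^(?P<cause>.+?), so (?P<effect>.+)$"),
]

STOPWORDS = {"the", "a", "of", "is", "are", "and"}  # toy list (assumption)


def extract_causal_pair(sentence):
    """Return (cause_clause, effect_clause) if a cue pattern matches, else None."""
    text = sentence.lower().strip().rstrip(".")
    for pattern in CAUSAL_PATTERNS:
        match = pattern.match(text)
        if match:
            return match.group("cause"), match.group("effect")
    return None


def tokenize(clause):
    """Split a clause into lowercase alphabetic tokens, dropping stopwords."""
    return [w for w in re.findall(r"[a-z]+", clause) if w not in STOPWORDS]


def build_causal_network(sentences):
    """Map each directed (cause_word, effect_word) edge to its PMI score."""
    pair_counts = Counter()
    for sentence in sentences:
        pair = extract_causal_pair(sentence)
        if pair is None:
            continue
        cause_words = set(tokenize(pair[0]))
        effect_words = set(tokenize(pair[1]))
        pair_counts.update(product(cause_words, effect_words))

    total = sum(pair_counts.values())
    cause_counts, effect_counts = Counter(), Counter()
    for (c, e), n in pair_counts.items():
        cause_counts[c] += n
        effect_counts[e] += n

    # PMI(c, e) = log( p(c, e) / (p(c) * p(e)) ), estimated from word-pair counts.
    return {
        (c, e): math.log(
            (n / total) / ((cause_counts[c] / total) * (effect_counts[e] / total))
        )
        for (c, e), n in pair_counts.items()
    }


sentences = [
    "Because temperature rises, glaciers melt.",
    "Glaciers melt, so seas rise.",
    "Because temperature rises, seas rise.",
    "Glaciers are large.",  # no causal cue, so it contributes no edges
]
network = build_causal_network(sentences)
```

Edges are directed from cause word to effect word, so `("glaciers", "seas")` is in `network` while `("seas", "glaciers")` is not. Keeping direction in the network is what would let a downstream check flag an option that reverses cause and effect, though how the paper uses the network inside BERT is not specified here.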

CLC Number: