
Computer Engineering ›› 2023, Vol. 49 ›› Issue (8): 104-110. doi: 10.19678/j.issn.1000-3428.0065461

• Artificial Intelligence and Pattern Recognition •

Document-Level Relation Extraction Based on Evidential Sentences and Graph Convolutional Network

Jianhong MA, Tian GONG, Shuang YAO*

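  1. School of Artificial Intelligence and Data Science, Hebei University of Technology, Tianjin 300401, China
  • Received: 2022-08-08 Published: 2023-08-15 Online: 2022-12-13
  • Corresponding author: Shuang YAO
  • About the authors:

    Jianhong MA (born 1965), female, professor, Ph.D.; main research interests: natural language processing and knowledge graphs

    Tian GONG, master's student

  • Funding:
    Special Project on Innovation Methods, Ministry of Science and Technology of China (2019IM020300)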

Document-Level Relation Extraction Based on Evidential Sentences and Graph Convolutional Network

Jianhong MA, Tian GONG, Shuang YAO*   

  1. School of Artificial Intelligence, Hebei University of Technology, Tianjin 300401, China
  • Received: 2022-08-08 Published: 2023-08-15 Online: 2022-12-13
  • Contact: Shuang YAO

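Abstract:

Document-level relation extraction models based on graph convolutional networks (GCNs) do not differentiate the contributions of neighboring nodes and are affected by sentence noise. To address these problems, an improved document-level relation extraction model is constructed that integrates evidential sentences into the GCN for message propagation. Heuristic paths are used to obtain path information containing evidential sentences, and relation extraction is performed on that path information. The proportion of sentences across all sample paths is counted, and the evidential sentence path information is integrated into a heterogeneous graph for similarity computation, yielding the three evidential sentences related to each sample. Based on the evidential sentence information, edges of different types are assigned weights according to contribution-differentiation rules, and graph convolution is applied to enhance the node information a second time, finally achieving document-level relation extraction. Experimental results on the DocRED dataset show that the model reaches an F1 score of 56.96%, an improvement of 0.42 to 13.51 percentage points over baseline models such as Paths and Hin-Glove, which verifies that incorporating evidential sentence information into the document graph effectively improves document-level relation extraction.

Key words: document-level relation extraction, graph convolutional network, evidential sentence, heterogeneous graph, weight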

Abstract:

Document-level relation extraction models based on a Graph Convolutional Network (GCN) do not distinguish the contributions of neighboring nodes and suffer from sentence noise. To address these issues, an improved document-level relation extraction model is built, in which evidential sentences are integrated into the GCN for message propagation. Based on heuristic paths, path information containing evidential sentences is obtained and used to extract relations. The proportion of sentences in all sample paths is counted, and evidential sentence path information is integrated into the heterogeneous graph for similarity calculation to obtain three evidential sentences related to each sample. The evidential sentence information is then used to assign weights to different types of edges according to contribution differentiation rules, and a graph convolution operation is applied to enhance the node information a second time, ultimately achieving document-level relation extraction. Experimental results on the Document-level Relation Extraction Dataset (DocRED) show that the F1 score of the model reaches 56.96%, which is 0.42 to 13.51 percentage points higher than those of baseline models such as Paths and Hin-Glove. This verifies that incorporating evidential sentence information into document graphs is effective for improving the performance of document-level relation extraction models.

Key words: document-level relation extraction, Graph Convolutional Network (GCN), evidential sentence, heterogeneous graph, weight
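
The abstract names two mechanisms: selecting the three evidential sentences most similar to a sample, and propagating node information over a graph whose edges are weighted by type. The sketch below illustrates both ideas in plain Python and is not the authors' implementation. The cosine similarity measure, the edge-type names, the weight values in `EDGE_WEIGHTS`, the mean-style normalization, and the omission of a trainable projection matrix are all assumptions made for illustration; the abstract does not specify them.

```python
import math
from typing import Dict, List, Tuple

Vector = List[float]


def cosine(u: Vector, v: Vector) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero norm."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm > 0 else 0.0


def top_k_evidence(query: Vector, sentences: List[Vector], k: int = 3) -> List[int]:
    """Indices of the k sentence vectors most similar to the query, best first.

    k defaults to 3 because the abstract reports three evidential sentences
    per sample; using cosine similarity here is an assumption.
    """
    order = sorted(
        range(len(sentences)),
        key=lambda i: cosine(query, sentences[i]),
        reverse=True,
    )
    return order[:k]


# Hypothetical edge types and weights, chosen only to show the mechanism.
EDGE_WEIGHTS: Dict[str, float] = {
    "evidence": 2.0,
    "mention-sentence": 1.0,
    "sentence-sentence": 0.5,
}


def weighted_gcn_layer(
    features: List[Vector],
    edges: List[Tuple[int, int, str]],
    edge_weights: Dict[str, float],
) -> List[Vector]:
    """One propagation step over an undirected, edge-typed graph.

    h_i' = ReLU((h_i + sum_j w_ij * h_j) / (1 + sum_j w_ij)), where w_ij is
    the weight of the type of edge (i, j). The trainable projection of a
    full GCN layer is left out (treated as the identity) to keep this short.
    """
    dim = len(features[0])
    agg = [list(h) for h in features]   # self-loop with weight 1
    total = [1.0] * len(features)       # running normalizer per node
    for src, dst, etype in edges:
        w = edge_weights[etype]
        for node, neighbor in ((src, dst), (dst, src)):
            total[node] += w
            for d in range(dim):
                agg[node][d] += w * features[neighbor][d]
    return [[max(0.0, x / total[i]) for x in agg[i]] for i in range(len(agg))]
```

Calling `weighted_gcn_layer` twice in succession gives a two-hop version of the same propagation. With these illustrative weights, a neighbor reached through an `"evidence"` edge contributes twice as much as one reached through a `"mention-sentence"` edge, which is the kind of contribution differentiation the abstract describes.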